Package com.acumenvelocity.ath.solr.tm
Class SolrTmFilter
java.lang.Object
  net.sf.okapi.common.filters.AbstractFilter
    com.acumenvelocity.ath.solr.tm.SolrTmFilter
- All Implemented Interfaces:
AutoCloseable, Iterator<net.sf.okapi.common.Event>, net.sf.okapi.common.filters.IFilter
public class SolrTmFilter extends net.sf.okapi.common.filters.AbstractFilter
Streaming Solr translation memory (TM) filter designed for large-scale TM operations. It uses Solr's deep paging through cursor marks to process millions of translation units without exhausting heap memory. The filter transforms Solr query results into Okapi event streams that can be fed into translation processing pipelines. Documents are retrieved in pages of a configurable size and processed incrementally, so memory use depends on the page size rather than on the total size of the translation memory.
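The cursor-based paging described above can be illustrated with a minimal, stand-alone sketch. It does not use SolrJ or Okapi, and it is not this class's implementation: `CursorPagingSketch`, `FakeBackend`, `Page`, and `PagedIterator` are hypothetical names, and the backend is an in-memory list standing in for a Solr collection sorted on its unique key. What it does borrow from Solr cursor marks is the protocol: the first request sends `*`, each response carries the cursor for the next request, and an unchanged cursor signals the end of the results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Stand-alone sketch of cursor-based paging; no Solr or Okapi classes are used. */
class CursorPagingSketch {

    /** One page of results plus the cursor to send with the next request. */
    static final class Page {
        final List<String> docs;
        final String nextCursor;

        Page(List<String> docs, String nextCursor) {
            this.docs = docs;
            this.nextCursor = nextCursor;
        }
    }

    /** Hypothetical stand-in for a collection sorted on its unique key. */
    static final class FakeBackend {
        private final List<String> sortedIds;
        int requests = 0; // number of page requests served so far

        FakeBackend(List<String> ids) {
            this.sortedIds = new ArrayList<>(ids);
            Collections.sort(this.sortedIds);
        }

        /** Returns up to 'rows' ids that sort after the cursor; "*" means start. */
        Page fetch(String cursor, int rows) {
            requests++;
            List<String> out = new ArrayList<>();
            for (String id : sortedIds) {
                if (out.size() == rows) {
                    break;
                }
                if ("*".equals(cursor) || id.compareTo(cursor) > 0) {
                    out.add(id);
                }
            }
            // An empty page returns the cursor unchanged, which signals the end.
            String next = out.isEmpty() ? cursor : out.get(out.size() - 1);
            return new Page(out, next);
        }
    }

    /** Iterator that holds only the current page in memory. */
    static final class PagedIterator implements Iterator<String> {
        private final FakeBackend backend;
        private final int rows;
        private String cursor = "*";
        private Iterator<String> current = Collections.emptyIterator();
        private boolean exhausted = false;

        PagedIterator(FakeBackend backend, int rows) {
            this.backend = backend;
            this.rows = rows;
        }

        @Override
        public boolean hasNext() {
            // Fetch the next page only when the current one is used up.
            while (!current.hasNext() && !exhausted) {
                Page page = backend.fetch(cursor, rows);
                if (page.nextCursor.equals(cursor)) {
                    exhausted = true;
                }
                cursor = page.nextCursor;
                current = page.docs.iterator();
            }
            return current.hasNext();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException("no more documents");
            }
            return current.next();
        }
    }
}
```

Because the iterator discards each page before requesting the next, its memory footprint is bounded by the page size (`rows` here), whatever the total number of documents.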
-
Constructor Summary
Constructors

| Constructor | Description |
| --- | --- |
| SolrTmFilter(org.apache.solr.client.solrj.SolrClient solrClient, String tmCollection, UUID tmId) | Constructs a streaming filter for a specific translation memory. |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | close() | Terminates the filter and releases associated resources. |
| long | estimateTotalSegments() | Queries Solr for the total count of matching segments without retrieval. |
| org.apache.solr.client.solrj.SolrQuery | getQuery() | Provides read access to the query configuration. |
| String | getTmCollection() | Returns the collection name being queried. |
| UUID | getTmId() | Returns the translation memory identifier. |
| boolean | hasNext() | Indicates whether more events are available in the processing stream. |
| boolean | isActive() | Indicates whether the filter is currently active. |
| net.sf.okapi.common.Event | next() | Retrieves the next event from the processing stream. |
| void | open(net.sf.okapi.common.resource.RawDocument input) | Activates the filter and establishes the streaming connection to Solr. |

-
Methods inherited from class net.sf.okapi.common.filters.AbstractFilter
addConfiguration, cancel, createFilterWriter, createSkeletonWriter, getConfiguration, getConfigurations, getDisplayName, getDocumentId, getDocumentName, getEncoderManager, getEncoding, getMimeType, getName, getNewlineType, getParameters, getParameters, getParametersClassName, getParentId, getSrcLoc, getTrgLoc, isCanceled, isGenerateSkeleton, isMultilingual, open, removeConfiguration, setFilterConfigurationMapper, setMimeType, setOptions, setParameters, setParentId, setSrcLoc, setTrgLoc
-
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.util.Iterator
forEachRemaining, remove
-
Constructor Detail
-
SolrTmFilter
public SolrTmFilter(org.apache.solr.client.solrj.SolrClient solrClient, String tmCollection, UUID tmId)
Constructs a streaming filter for a specific translation memory.
- Parameters:
solrClient - Connection to the Solr instance
tmCollection - Target translation memory collection
tmId - Translation memory identifier to filter by
- Throws:
IllegalArgumentException - if any required parameter is null
-
-
Method Detail
-
open
public void open(net.sf.okapi.common.resource.RawDocument input)
Activates the filter and establishes the streaming connection to Solr. Validates connectivity before initializing the document iterator.
- Parameters:
input - Raw document wrapper providing filter context
- Throws:
net.sf.okapi.common.exceptions.OkapiException - if Solr connectivity fails
-
hasNext
public boolean hasNext()
Indicates whether more events are available in the processing stream.
- Returns:
- true if additional events can be retrieved
-
next
public net.sf.okapi.common.Event next()
Retrieves the next event from the processing stream. Emits document boundary markers and text unit events in sequence.
- Returns:
- The next available event
- Throws:
NoSuchElementException - when the stream is depleted
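The event ordering and the depletion behavior of next() can be sketched without any Okapi or Solr dependency. The following is an illustration only, under stated assumptions: `EventStreamSketch` is a hypothetical class, events are modeled as plain strings, and the assumed order (one start-of-document marker, one text unit per segment, one end-of-document marker) is a reading of "document boundary markers and text unit events in sequence", not something the documentation above confirms in detail.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical event stream: START_DOCUMENT, one TEXT_UNIT per segment, END_DOCUMENT. */
class EventStreamSketch implements Iterator<String>, AutoCloseable {

    private enum State { START, UNITS, END, DONE }

    private final Iterator<String> segments;
    private State state = State.START;

    EventStreamSketch(List<String> segments) {
        this.segments = segments.iterator();
    }

    @Override
    public boolean hasNext() {
        return state != State.DONE;
    }

    @Override
    public String next() {
        switch (state) {
            case START:
                // Skip straight to the end marker when there are no segments.
                state = segments.hasNext() ? State.UNITS : State.END;
                return "START_DOCUMENT";
            case UNITS:
                String unit = "TEXT_UNIT:" + segments.next();
                if (!segments.hasNext()) {
                    state = State.END;
                }
                return unit;
            case END:
                state = State.DONE;
                return "END_DOCUMENT";
            default:
                throw new NoSuchElementException("event stream is depleted");
        }
    }

    /** Marks the stream as finished; nothing external is released in this sketch. */
    @Override
    public void close() {
        state = State.DONE;
    }

    /** Drains a stream with try-with-resources and returns the events in order. */
    static List<String> collect(List<String> segments) {
        List<String> events = new ArrayList<>();
        try (EventStreamSketch stream = new EventStreamSketch(segments)) {
            while (stream.hasNext()) {
                events.add(stream.next());
            }
        }
        return events;
    }
}
```

Guarding every next() call with hasNext(), as collect does, avoids the NoSuchElementException, and try-with-resources ensures close() runs even if processing fails midway.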
-
close
public void close()
Terminates the filter and releases associated resources. The Solr client remains open as it's externally managed.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface net.sf.okapi.common.filters.IFilter
- Overrides:
close in class net.sf.okapi.common.filters.AbstractFilter
-
estimateTotalSegments
public long estimateTotalSegments() throws net.sf.okapi.common.exceptions.OkapiException
Queries Solr for the total count of matching segments without retrieval. Useful for displaying progress indicators or estimating resource needs.
- Returns:
- Total segment count matching the query
- Throws:
net.sf.okapi.common.exceptions.OkapiException - if the count operation fails
-
getTmCollection
public String getTmCollection()
Returns the collection name being queried.
- Returns:
- Solr collection identifier
-
getTmId
public UUID getTmId()
Returns the translation memory identifier.
- Returns:
- TM ID being filtered
-
getQuery
public org.apache.solr.client.solrj.SolrQuery getQuery()
Provides read access to the query configuration.
- Returns:
- Copy of the configured query
-
isActive
public boolean isActive()
Indicates whether the filter is currently active.
- Returns:
- true if filter is operational
-