Pepper
3.6.0
A highly extensible platform for conversion and manipulation of linguistic data.
|
Public Member Functions | |
CorpusPathResolver (final URI corpusPath) throws FileNotFoundException | |
Collection< String > | sampleFileContent (final String... fileEndings) |
Returns NUMBER_OF_SAMPLED_LINES (10) lines of a sampled set of NUMBER_OF_SAMPLED_FILES (20) files having one of the endings specified by fileEndings, searched recursively from the specified corpus path. | |
Collection< String > | sampleFileContent (int numberOfSampledFiles, int numberOfSampledLines, final String... fileEndings) |
Returns numberOfSampledLines lines of a sampled set of numberOfSampledFiles files having one of the endings specified by fileEndings, searched recursively from the specified corpus path. | |
Static Public Attributes | |
static final int | NUMBER_OF_SAMPLED_FILES = 20 |
The number of files which are read for sampling when invoking findAppropriateImporters(URI). | |
static final int | NUMBER_OF_SAMPLED_LINES = 10 |
The number of lines in a file which are read for sampling when invoking findAppropriateImporters(URI). | |
Protected Member Functions | |
void | setCorpusPath (final URI corpusPath) throws FileNotFoundException |
Multimap< String, File > | groupFilesByEnding (final URI corpusPath) throws FileNotFoundException |
Groups files by their file ending into a multimap. | |
Collection< FileContent > | getXFilesWithExtension (int numOfFiles, int numOfLinesToRead, final String fileEnding) |
Collection< File > | sampleFiles (final Collection< File > files, int numberOfSampledFiles) |
Creates a sampled set of numberOfSampledFiles files recursively from directory dir with specified endings. | |
String | readFirstLines (final File file, final int numOfLinesToRead) |
Reads the first numOfLinesToRead lines of the passed file and returns them as a String. | |
Protected Attributes | |
Multimap< String, File > | unreadFilesGroupedByExtension |
Multimap< String, FileContent > | readFilesGroupedByExtension |
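Multimap<String, File> org.corpus_tools.pepper.impl.CorpusPathResolver.groupFilesByEnding | ( | final URI | corpusPath | ) | throws FileNotFoundException |
protected |
Groups files by their file ending into a multimap. The key is the ending.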
corpusPath |
FileNotFoundException |
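The grouping step can be illustrated with a minimal sketch. It is not Pepper's implementation: it uses a plain `Map<String, List<File>>` from the standard library instead of the `Multimap` named above, and the class `GroupByEndingSketch` and the helper `endingOf` are hypothetical names. It also assumes that the "ending" is the text after the last dot of the file name, without the dot.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

class GroupByEndingSketch {

    // Hypothetical helper (assumption): the ending is everything after the
    // last dot of the file name; names without a dot get the empty ending.
    static String endingOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // Walks corpusPath recursively and groups all regular files by ending.
    // The key of the returned map is the ending.
    static Map<String, List<File>> groupFilesByEnding(Path corpusPath) {
        Map<String, List<File>> grouped = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(corpusPath)) {
            paths.filter(Files::isRegularFile)
                 .map(Path::toFile)
                 .forEach(f -> grouped
                         .computeIfAbsent(endingOf(f.getName()), k -> new ArrayList<>())
                         .add(f));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return grouped;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: GroupByEndingSketch <corpus directory>");
            return;
        }
        groupFilesByEnding(Paths.get(args[0])).forEach((ending, files) ->
                System.out.println(ending + " -> " + files.size() + " file(s)"));
    }
}
```

Grouping once up front means later requests for a given ending only need a map lookup rather than another directory traversal.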
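String org.corpus_tools.pepper.impl.CorpusPathResolver.readFirstLines | ( | final File | file, |
final int | numOfLinesToRead | ||
) |
protected |
Reads the first numOfLinesToRead lines of the passed file and returns them as a String.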
file | path to file |
numOfLinesToRead | number of lines |
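Reading only the first lines of a file can be sketched as follows. This is an illustration under assumptions, not Pepper's code: the class name `FirstLinesSketch` is hypothetical, the file is assumed to be UTF-8 encoded, and the lines are joined with `\n`. The `Reader` overload exists only so the logic can be tried without touching the file system.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.stream.Collectors;

class FirstLinesSketch {

    // Reads at most numOfLinesToRead lines from the source and joins them
    // with '\n'. The stream is lazy, so the rest of the input is not read.
    static String readFirstLines(Reader source, int numOfLinesToRead) {
        BufferedReader reader = new BufferedReader(source);
        return reader.lines()
                     .limit(numOfLinesToRead)
                     .collect(Collectors.joining("\n"));
    }

    // File-based variant; the UTF-8 encoding is an assumption of this sketch.
    static String readFirstLines(File file, int numOfLinesToRead) {
        try (Reader reader = Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
            return readFirstLines(reader, numOfLinesToRead);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the first two of three lines.
        System.out.println(readFirstLines(new StringReader("line 1\nline 2\nline 3"), 2));
    }
}
```

If the file has fewer lines than requested, the sketch simply returns all of them.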
Collection<String> org.corpus_tools.pepper.impl.CorpusPathResolver.sampleFileContent | ( | final String... | fileEndings | ) |
Returns NUMBER_OF_SAMPLED_LINES (10) lines of a sampled set of NUMBER_OF_SAMPLED_FILES (20) files having one of the endings specified by fileEndings, searched recursively from the specified corpus path.
fileEndings | endings to be considered. If no endings are specified, all files are considered |
Collection<String> org.corpus_tools.pepper.impl.CorpusPathResolver.sampleFileContent | ( | int | numberOfSampledFiles, |
int | numberOfSampledLines, | ||
final String... | fileEndings | ||
) |
Returns numberOfSampledLines lines of a sampled set of numberOfSampledFiles files having one of the endings specified by fileEndings, searched recursively from the specified corpus path.
numberOfSampledFiles | number of files to be read |
numberOfSampledLines | number of lines to be read |
fileEndings | endings to be considered. If no endings are specified, all files are considered |
Returns numberOfSampledLines lines of the sampled files.
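How the three parameters interact can be shown with a small in-memory sketch. It is only an approximation of the documented behaviour: the class `SampleFileContentSketch` is hypothetical, a `Map` from file name to file content stands in for the corpus directory, the first matching files are taken in map order instead of being sampled, and an ending is assumed to be given without the leading dot.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SampleFileContentSketch {

    // corpus maps a file name to its full content (stand-in for real files).
    // Returns one String per selected file, holding its first lines.
    static List<String> sampleFileContent(Map<String, String> corpus,
            int numberOfSampledFiles, int numberOfSampledLines, String... fileEndings) {
        return corpus.entrySet().stream()
                // No endings specified: all files are considered.
                .filter(e -> fileEndings.length == 0
                        || Arrays.stream(fileEndings)
                                 .anyMatch(end -> e.getKey().endsWith("." + end)))
                // Assumption: take the first matches instead of sampling.
                .limit(numberOfSampledFiles)
                .map(e -> Arrays.stream(e.getValue().split("\n"))
                                .limit(numberOfSampledLines)
                                .collect(Collectors.joining("\n")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> corpus = new LinkedHashMap<>();
        corpus.put("a.xml", "x1\nx2\nx3");
        corpus.put("b.txt", "t1\nt2");
        corpus.put("c.xml", "y1");
        // Considers only the .xml entries and keeps at most two lines of each.
        System.out.println(sampleFileContent(corpus, 20, 2, "xml"));
    }
}
```

The varargs parameter keeps the "all files" case convenient: callers just omit the endings.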
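Collection<File> org.corpus_tools.pepper.impl.CorpusPathResolver.sampleFiles | ( | final Collection< File > | files, |
int | numberOfSampledFiles | ||
) |
protected |
Creates a sampled set of numberOfSampledFiles files recursively from directory dir with specified endings.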
dir | the directory to be traversed recursively |
numberOfSampledFiles | number of files to be sampled |
fileEndings | endings of files to be sampled |
Returns the sampled files with the specified endings in directory dir.
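Drawing a bounded sample from a collection of files can be sketched as below. How Pepper actually selects the files is not stated here, so the uniform random selection via `Collections.shuffle` is an assumption, as are the class name `SampleFilesSketch` and the extra `Random` parameter, which is there only to make runs reproducible.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SampleFilesSketch {

    // Returns up to numberOfSampledFiles files chosen at random from files.
    // If fewer files are available, all of them are returned.
    static List<File> sampleFiles(Collection<File> files, int numberOfSampledFiles,
            Random random) {
        List<File> shuffled = new ArrayList<>(files);
        Collections.shuffle(shuffled, random);
        int size = Math.min(Math.max(numberOfSampledFiles, 0), shuffled.size());
        return new ArrayList<>(shuffled.subList(0, size));
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(
                new File("a.xml"), new File("b.xml"), new File("c.xml"));
        // Picks two of the three files; which ones depends on the seed.
        System.out.println(sampleFiles(files, 2, new Random(42)));
    }
}
```

Copying the input before shuffling leaves the caller's collection untouched.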