
Description:

This property specifies whether small files in the data source are combined when input splits are created. Small files are files smaller than the DFS block size. Combining them allows several small files to be handled by a single mapper instead of using one mapper for each file.

Values and behavior:

  • true - Combines small files so that a single mapper can process several of them.

  • false - Does not combine files; each small file is handled by its own mapper.

Default value:

false

Scope:

  • Connection: If the property is set on the connection, the value applies to all dataset build and cube build jobs.

  • Cube: If the property is set on a cube, the value overrides the connection-level value for that cube’s build job.

  • Dataset: If the property is set on a dataset, the value overrides the cube-level value for that cube’s dataset build job.

NOTE: If the property is set on a dataset and the dataset is built on its own, the value overrides the connection-level value for that dataset build job.
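
The override order described above can be sketched as a small lookup. This is only an illustration of the precedence rules, not Kyvos code: the function name `resolve_effective_value`, its parameters, and the job labels are hypothetical, and `None` stands for "not set at that level".

```python
from typing import Optional

DEFAULT_VALUE = False  # documented default for this property


def resolve_effective_value(
    connection: Optional[bool] = None,
    cube: Optional[bool] = None,
    dataset: Optional[bool] = None,
    job: str = "dataset_build",
) -> bool:
    """Return the value a build job would use under the documented scope rules.

    Hypothetical helper: None means the property is not set at that level.
    """
    if job == "cube_build":
        # Cube build: the cube-level value overrides the connection-level value.
        candidates = (cube, connection)
    elif job == "dataset_build":
        # Dataset build: dataset overrides cube, which overrides connection.
        candidates = (dataset, cube, connection)
    else:
        raise ValueError(f"unknown job type: {job!r}")

    # The first level that has a value wins; otherwise fall back to the default.
    for value in candidates:
        if value is not None:
            return value
    return DEFAULT_VALUE
```

For example, `resolve_effective_value(connection=True, cube=False, job="cube_build")` returns `False`, because the cube-level setting takes precedence over the connection-level one for that cube's build.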

Comes into effect:

The property value can be changed at any time; the new value takes effect from the next build.

Dependencies and Related Properties:

  • The kyvos.build.execution.engine property must be set to MapReduce.

Recommendation:

Set this property to true when the input data consists of a large number of small files. Combining small files lets each mapper operate more efficiently and reduces the total number of mappers required, which in turn reduces the build time.
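
A rough back-of-the-envelope estimate can show why this helps. The sketch below is not how the engine computes splits; it assumes a 128 MB DFS block size and that combined splits are packed up to about one block each, and the function name `estimate_mappers` is hypothetical.

```python
import math
from typing import Sequence


def estimate_mappers(
    file_sizes_mb: Sequence[float],
    block_size_mb: float = 128,  # assumed DFS block size
    combine: bool = False,
) -> int:
    """Roughly estimate the number of mappers for a set of input files."""
    if not combine:
        # Without combining: at least one mapper per file, and files larger
        # than a block are split into one mapper per block.
        return sum(max(1, math.ceil(size / block_size_mb)) for size in file_sizes_mb)

    # With combining (simplifying assumption): files are packed together
    # into splits of up to one block each.
    total_mb = sum(file_sizes_mb)
    return math.ceil(total_mb / block_size_mb)


# 1,000 files of 4 MB each
small_files = [4] * 1000
print(estimate_mappers(small_files, combine=False))  # 1000
print(estimate_mappers(small_files, combine=True))   # 32
```

Under these assumptions, 1,000 files of 4 MB each would need about 32 mappers instead of 1,000. Actual numbers depend on the cluster's block size and on how the engine groups files.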
