This property specifies which cube build levels are sampled before MapReduce jobs are assigned. Kyvos runs a dedicated level job for each dimension during Full and Incremental cube builds. The level jobs are numbered from 1 to N+2, where N is the number of materialized dimensions (see kyvos.build.dimensions.materialize). Levels 1 and 2 perform pre-processing; the jobs from level 3 onward build the aggregations for each materialized dimension in order (see kyvos.build.dimension.order).
Values and Behavior:
Comma-separated positive integers (for example, 2,3,4). Level 1 cannot be sampled.
For example, suppose the order of dimensions in a Full cube build is D1, D2, D3, D4, D5. Setting the property value to 2 samples/profiles the data before building the aggregations for the dimension of the level 3 job, that is, D1. Setting the property value to 2,3 samples/profiles the data before building the aggregations for the dimensions of the level 3 and level 4 jobs, that is, D1 and D2.
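The mapping in the example above can be sketched in a few lines of Python. This is only an illustration and not part of Kyvos: the function name sampled_dimensions is hypothetical, and the upper bound on valid values (N+1) is an assumption inferred from the level numbering described above.

```python
def sampled_dimensions(property_value, dimension_order):
    """Map each level number in the property value to the dimension whose
    aggregation job runs right after the sampling step.

    Hypothetical helper for illustration only; it is not a Kyvos API.
    """
    result = {}
    for token in property_value.split(","):
        level = int(token)  # int() tolerates surrounding whitespace
        if level < 2:
            # Stated in the documentation: level 1 cannot be sampled.
            raise ValueError("Level 1 cannot be sampled")
        # A value of k samples before the level k+1 job. Level 3 builds the
        # first materialized dimension, so value k maps to index k - 2.
        index = level - 2
        if index >= len(dimension_order):
            # Assumption: values above N+1 have no matching dimension job.
            raise ValueError(f"No dimension job follows level {level}")
        result[level] = dimension_order[index]
    return result


order = ["D1", "D2", "D3", "D4", "D5"]
print(sampled_dimensions("2", order))    # {2: 'D1'}
print(sampled_dimensions("2,3", order))  # {2: 'D1', 3: 'D2'}
```

The sketch reproduces both cases from the example: a value of 2 selects D1, and a value of 2,3 selects D1 and D2.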
- Connection: If the property is set on the connection, the value applies to all Full/Incremental cube build jobs.
- Cube: If the property is set on a cube, the value overrides the connection-level value for that cube’s build jobs.
Comes into effect:
The property value can be changed at any time and takes effect in the next build.
Dependencies and related properties:
- The kyvos.build.execution.engine property must be set to MapReduce.
Generally, all dimensions should be sampled. If the data volume is sufficiently small at higher levels, sampling can be turned off for those levels by removing their level numbers from the property. However, the performance gain will be minimal because sampling takes very little time. Skipping sampling may increase cube build time and/or degrade query performance because of skewed aggregation files.