Class Plan
- java.lang.Object
-
- org.apache.flink.api.common.Plan
-
-
Field Summary
Fields Modifier and Type Field Description protected HashMap<String,DistributedCache.DistributedCacheEntry>cacheFileHash map for files in the distributed cache: registered name to cache entry.protected intdefaultParallelismThe default parallelism to use for nodes that have no explicitly specified parallelism.protected ExecutionConfigexecutionConfigConfig object for runtime execution parameters.protected StringjobNameThe name of the job.protected List<GenericDataSinkBase<?>>sinksA collection of all sinks in the plan.
-
Constructor Summary
Constructors Constructor Description Plan(Collection<? extends GenericDataSinkBase<?>> sinks)Creates a new program plan, describing the data flow that ends at the given data sinks.Plan(Collection<? extends GenericDataSinkBase<?>> sinks, int defaultParallelism)Creates a new program plan with the given default parallelism, describing the data flow that ends at the given data sinks.Plan(Collection<? extends GenericDataSinkBase<?>> sinks, String jobName)Creates a new dataflow plan with the given name, describing the data flow that ends at the given data sinks.Plan(Collection<? extends GenericDataSinkBase<?>> sinks, String jobName, int defaultParallelism)Creates a new program plan with the given name and default parallelism, describing the data flow that ends at the given data sinks.Plan(GenericDataSinkBase<?> sink)Creates a new program plan with single data sink.Plan(GenericDataSinkBase<?> sink, int defaultParallelism)Creates a new program plan with single data sink and the given default parallelism.Plan(GenericDataSinkBase<?> sink, String jobName)Creates a new program plan with the given name, containing initially a single data sink.Plan(GenericDataSinkBase<?> sink, String jobName, int defaultParallelism)Creates a new program plan with the given name and default parallelism, containing initially a single data sink.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description voidaccept(Visitor<Operator<?>> visitor)Traverses the job depth first from all data sinks on towards the sources.voidaddDataSink(GenericDataSinkBase<?> sink)Adds a data sink to the set of sinks in this program.Set<Map.Entry<String,DistributedCache.DistributedCacheEntry>>getCachedFiles()Return the registered cached files.Collection<? extends GenericDataSinkBase<?>>getDataSinks()Gets all the data sinks of this job.intgetDefaultParallelism()Gets the default parallelism for this job.ExecutionConfiggetExecutionConfig()Gets the execution config object.ConfigurationgetJobConfiguration()JobIDgetJobId()Gets the ID of the job that the dataflow plan belongs to.StringgetJobName()Gets the name of this job.intgetMaximumParallelism()StringgetPostPassClassName()Gets the optimizer post-pass class for this job.voidregisterCachedFile(String name, DistributedCache.DistributedCacheEntry entry)Register cache files at program level.voidsetDefaultParallelism(int defaultParallelism)Sets the default parallelism for this plan.voidsetExecutionConfig(ExecutionConfig executionConfig)Sets the runtime config object defining execution parameters.voidsetJobConfiguration(Configuration jobConfiguration)voidsetJobId(JobID jobId)Sets the ID of the job that the dataflow plan belongs to.voidsetJobName(String jobName)Sets the jobName for this Plan.
-
-
-
Field Detail
-
sinks
protected final List<GenericDataSinkBase<?>> sinks
A collection of all sinks in the plan. Since the plan is traversed from the sinks to the sources, this collection must contain all the sinks.
-
jobName
protected String jobName
The name of the job.
-
defaultParallelism
protected int defaultParallelism
The default parallelism to use for nodes that have no explicitly specified parallelism.
-
cacheFile
protected HashMap<String,DistributedCache.DistributedCacheEntry> cacheFile
Hash map for files in the distributed cache: registered name to cache entry.
-
executionConfig
protected ExecutionConfig executionConfig
Config object for runtime execution parameters.
-
-
Constructor Detail
-
Plan
public Plan(Collection<? extends GenericDataSinkBase<?>> sinks, String jobName)
Creates a new dataflow plan with the given name, describing the data flow that ends at the given data sinks.If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
- Parameters:
sinks- The collection will the sinks of the job's data flow.jobName- The name to display for the job.
-
Plan
public Plan(Collection<? extends GenericDataSinkBase<?>> sinks, String jobName, int defaultParallelism)
Creates a new program plan with the given name and default parallelism, describing the data flow that ends at the given data sinks.If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
- Parameters:
sinks- The collection will the sinks of the job's data flow.jobName- The name to display for the job.defaultParallelism- The default parallelism for the job.
-
Plan
public Plan(GenericDataSinkBase<?> sink, String jobName)
Creates a new program plan with the given name, containing initially a single data sink.If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
- Parameters:
sink- The data sink of the data flow.jobName- The name to display for the job.
-
Plan
public Plan(GenericDataSinkBase<?> sink, String jobName, int defaultParallelism)
Creates a new program plan with the given name and default parallelism, containing initially a single data sink.If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
- Parameters:
sink- The data sink of the data flow.jobName- The name to display for the job.defaultParallelism- The default parallelism for the job.
-
Plan
public Plan(Collection<? extends GenericDataSinkBase<?>> sinks)
Creates a new program plan, describing the data flow that ends at the given data sinks. The display name for the job is generated using a timestamp.If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
- Parameters:
sinks- The collection will the sinks of the data flow.
-
Plan
public Plan(Collection<? extends GenericDataSinkBase<?>> sinks, int defaultParallelism)
Creates a new program plan with the given default parallelism, describing the data flow that ends at the given data sinks. The display name for the job is generated using a timestamp.If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
- Parameters:
sinks- The collection will the sinks of the data flow.defaultParallelism- The default parallelism for the job.
-
Plan
public Plan(GenericDataSinkBase<?> sink)
Creates a new program plan with single data sink. The display name for the job is generated using a timestamp.If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
- Parameters:
sink- The data sink of the data flow.
-
Plan
public Plan(GenericDataSinkBase<?> sink, int defaultParallelism)
Creates a new program plan with single data sink and the given default parallelism. The display name for the job is generated using a timestamp.If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
- Parameters:
sink- The data sink of the data flow.defaultParallelism- The default parallelism for the job.
-
-
Method Detail
-
addDataSink
public void addDataSink(GenericDataSinkBase<?> sink)
Adds a data sink to the set of sinks in this program.- Parameters:
sink- The data sink to add.
-
getDataSinks
public Collection<? extends GenericDataSinkBase<?>> getDataSinks()
Gets all the data sinks of this job.- Returns:
- All sinks of the program.
-
getJobName
public String getJobName()
Gets the name of this job.- Returns:
- The name of the job.
-
setJobName
public void setJobName(String jobName)
Sets the jobName for this Plan.- Parameters:
jobName- The jobName to set.
-
getJobId
public JobID getJobId()
Gets the ID of the job that the dataflow plan belongs to. If this ID is not set, then the dataflow represents its own independent job.- Returns:
- The ID of the job that the dataflow plan belongs to.
-
setJobId
public void setJobId(JobID jobId)
Sets the ID of the job that the dataflow plan belongs to. If this ID is set tonull, then the dataflow represents its own independent job.- Parameters:
jobId- The ID of the job that the dataflow plan belongs to.
-
getDefaultParallelism
public int getDefaultParallelism()
Gets the default parallelism for this job. That degree is always used when an operator is not explicitly given a parallelism.- Returns:
- The default parallelism for the plan.
-
setDefaultParallelism
public void setDefaultParallelism(int defaultParallelism)
Sets the default parallelism for this plan. That degree is always used when an operator is not explicitly given a parallelism.- Parameters:
defaultParallelism- The default parallelism for the plan.
-
getPostPassClassName
public String getPostPassClassName()
Gets the optimizer post-pass class for this job. The post-pass typically creates utility classes for data types and is specific to a particular data model (record, tuple, Scala, ...)- Returns:
- The name of the class implementing the optimizer post-pass.
-
getExecutionConfig
public ExecutionConfig getExecutionConfig()
Gets the execution config object.- Returns:
- The execution config object.
-
setExecutionConfig
public void setExecutionConfig(ExecutionConfig executionConfig)
Sets the runtime config object defining execution parameters.- Parameters:
executionConfig- The execution config to use.
-
accept
public void accept(Visitor<Operator<?>> visitor)
Traverses the job depth first from all data sinks on towards the sources.- Specified by:
acceptin interfaceVisitable<Operator<?>>- Parameters:
visitor- The visitor to be called with this object as the parameter.- See Also:
Visitable.accept(Visitor)
-
registerCachedFile
public void registerCachedFile(String name, DistributedCache.DistributedCacheEntry entry) throws IOException
Register cache files at program level.- Parameters:
entry- contains all relevant informationname- user defined name of that file- Throws:
IOException
-
getCachedFiles
public Set<Map.Entry<String,DistributedCache.DistributedCacheEntry>> getCachedFiles()
Return the registered cached files.- Returns:
- Set of (name, filePath) pairs
-
getMaximumParallelism
public int getMaximumParallelism()
-
setJobConfiguration
public void setJobConfiguration(Configuration jobConfiguration)
-
getJobConfiguration
public Configuration getJobConfiguration()
-
-