---
title: "Production Readiness Checklist"
nav-parent_id: ops
nav-pos: 10
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->

The production readiness checklist provides an overview of configuration options that should be carefully considered before bringing an Apache Flink job into production. 
While the Flink community has attempted to provide sensible defaults for each configuration, it is important to review this list and ensure the options chosen are sufficient for your needs. 

* ToC
{:toc}

### Set An Explicit Max Parallelism

The max parallelism, set on a per-job and per-operator granularity, determines the maximum parallelism to which a stateful operator can scale.
There is currently **no way to change** the maximum parallelism of an operator after a job has started without discarding that operator's state.
Max parallelism is bounded, rather than letting stateful operators scale without limit, because it affects your application's performance and state size.
To be able to rescale state, Flink has to maintain metadata that grows linearly with max parallelism.
In general, choose a max parallelism that is high enough to cover your future scalability needs, yet low enough to maintain reasonable performance.

{% panel **Note:** Maximum parallelism must fulfill the following conditions: `0 < parallelism <= max parallelism <= 2^15` %}

You can explicitly set the maximum parallelism by using `setMaxParallelism(int maxParallelism)`.
If no max parallelism is set, Flink derives one from the operator's parallelism when the job is first started:

- `128` : for all parallelism <= 128.
- `MIN(nextPowerOfTwo(parallelism + (parallelism / 2)), 2^15)` : for all parallelism > 128.
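
The default rule and the validity condition above can be sketched in plain Java. This is an illustration of the rule as documented here, not Flink's internal implementation: the class and method names are hypothetical, and it assumes that `nextPowerOfTwo` rounds up to the nearest power of two and leaves an exact power of two unchanged.

```java
// Sketch of the documented default rule; names here are hypothetical, not Flink API.
class MaxParallelismSketch {
    // Upper bound from the note above: 2^15.
    static final int UPPER_BOUND = 1 << 15;
    // Default used for small parallelism, per the list above.
    static final int LOWER_DEFAULT = 128;

    // Smallest power of two >= x (assumption: an exact power of two maps to itself).
    static int roundUpToPowerOfTwo(int x) {
        int highest = Integer.highestOneBit(x);
        return highest == x ? x : highest << 1;
    }

    // Default max parallelism as described by the two list items above.
    static int defaultMaxParallelism(int parallelism) {
        if (parallelism <= LOWER_DEFAULT) {
            return LOWER_DEFAULT;
        }
        return Math.min(roundUpToPowerOfTwo(parallelism + parallelism / 2), UPPER_BOUND);
    }

    // Checks 0 < parallelism <= max parallelism <= 2^15.
    static boolean isValid(int parallelism, int maxParallelism) {
        return 0 < parallelism && parallelism <= maxParallelism && maxParallelism <= UPPER_BOUND;
    }

    public static void main(String[] args) {
        for (int p : new int[] {64, 200, 30000}) {
            System.out.println(p + " -> " + defaultMaxParallelism(p));
        }
    }
}
```

Under this rule, a job started with parallelism 200 would get a default max parallelism of 512, so it could never be rescaled beyond 512 without discarding state. Setting the value explicitly avoids depending on the parallelism the job happened to start with.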

### Set UUIDs For All Operators

As mentioned in the documentation for [savepoints]({% link ops/state/savepoints.md %}), users should set uids for each operator in their `DataStream`.
Flink uses uids to map operator state to operators, which, in turn, is essential for savepoints.
By default, operator uids are generated by traversing the JobGraph and hashing specific operator properties.
While this is convenient from a user perspective, it is also very fragile, as changes to the JobGraph (e.g., exchanging an operator) result in new uids.
To establish a stable mapping, we need stable operator uids provided by the user through `uid(String uid)`.
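
The fragility of generated ids can be illustrated with a toy model in plain Java. Everything in it is an assumption made for illustration: the `Op` class, the `generatedId` hash over graph position and operator name, and the operator names are hypothetical, and Flink's real id generation hashes different properties. The point is only that an id derived from the graph changes when the graph changes, while a user-provided uid does not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Toy model only: not Flink code, and not Flink's actual hashing scheme.
class UidMappingSketch {
    static final class Op {
        final String name;
        final String uid; // null when the user did not assign one

        Op(String name, String uid) {
            this.name = name;
            this.uid = uid;
        }
    }

    // Hypothetical generated id, derived from graph position and operator name.
    static String generatedId(List<Op> graph, int index) {
        return Integer.toHexString(Objects.hash(index, graph.get(index).name));
    }

    // Id that state would be stored under: the explicit uid if set, else the generated id.
    static String stateId(List<Op> graph, int index) {
        Op op = graph.get(index);
        return op.uid != null ? op.uid : generatedId(graph, index);
    }

    public static void main(String[] args) {
        // Version 1 of the job, and version 2 with a filter inserted after the source.
        List<Op> v1 = Arrays.asList(
                new Op("source", null), new Op("count", "count-uid"), new Op("sum", null));
        List<Op> v2 = Arrays.asList(
                new Op("source", null), new Op("filter", null),
                new Op("count", "count-uid"), new Op("sum", null));

        // "count" carries an explicit uid, so its state id survives the change.
        System.out.println("explicit uid stable: " + stateId(v1, 1).equals(stateId(v2, 2)));
        // "sum" relies on the generated id, which shifts with its position.
        System.out.println("generated id stable: " + stateId(v1, 2).equals(stateId(v2, 3)));
    }
}
```

In this model the operator with an explicit uid keeps its id across both versions, while the operator without one gets a new id after the unrelated filter is inserted, so its previously saved state could no longer be matched to it.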

### Choose The Right State Backend

See the [description of state backends]({% link ops/state/state_backends.md %}#choose-the-right-state-backend) for choosing the right one for your use case.

### Configure JobManager High Availability

The JobManager serves as a central coordinator for each Flink deployment, being responsible for both scheduling and resource management of the cluster.
It is a single point of failure within the cluster, and if it crashes, no new jobs can be submitted, and running applications will fail. 

Configuring [High Availability]({% link deployment/ha/index.md %}), in conjunction with Apache ZooKeeper, allows for a swift recovery and is highly recommended for production setups.


{% top %}
