---
title: "Operators"
nav-parent_id: python_datastream_api
nav-pos: 20
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->


Operators transform one or more DataStreams into a new DataStream. Programs can combine multiple transformations into 
sophisticated dataflow topologies.

* This will be replaced by the TOC
{:toc}

# DataStream Transformations

DataStream programs in Flink are regular programs that implement transformations on data streams (e.g., mapping, 
filtering, reducing). See [operators]({% link dev/stream/operators/index.md %}?code_tab=python) for an overview of the
stream transformations available in the Python DataStream API.

# Functions
Most transformations require a user-defined function as input to define the functionality of the transformation. The 
following describes different ways of defining user-defined functions.

## Implementing Function Interfaces
The Python DataStream API provides a separate Function interface for each kind of transformation. For example, 
`MapFunction` is provided for the `map` transformation and `FilterFunction` for the `filter` transformation.
Users implement the Function interface that corresponds to the transformation they want to apply. Take `MapFunction` for
instance:
<p>
{% highlight python %}
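from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment, MapFunction

# Implementing MapFunction
class MyMapFunction(MapFunction):

    def map(self, value):
        # Called once per element; the return value becomes the output element.
        return value + 1

env = StreamExecutionEnvironment.get_execution_environment()
data_stream = env.from_collection([1, 2, 3, 4, 5], type_info=Types.INT())
mapped_stream = data_stream.map(MyMapFunction(), output_type=Types.INT())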
{% endhighlight %}
</p>
<span class="label label-info">Note</span> In the Python DataStream API, users can specify the output type information of a transformation explicitly. If it is not 
specified, the output type defaults to `Types.PICKLED_BYTE_ARRAY`, meaning that the data is a byte array generated by 
the pickle serializer. For more details about pickle serialization, see [DataTypes]({% link dev/python/datastream-api-users-guide/data_types.md %}#pickle-serialization).
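
To make the default concrete, the following is a minimal sketch of what a pickled byte array looks like. It uses Python's standard `pickle` module directly rather than PyFlink's own serializer, whose internals may differ, and the `record` value is a hypothetical element chosen only for illustration.

```python
import pickle

# Hypothetical element, such as a transformation without an explicit
# output_type might emit (an assumption for illustration only).
record = (1, "flink", [0.5, 1.5])

# Serializing yields an opaque bytes object; the tuple structure is no
# longer visible to anything that does not unpickle it.
payload = pickle.dumps(record)

# Deserializing recovers an object equal to the original.
restored = pickle.loads(payload)

print(type(payload).__name__)
print(restored == record)
```

The payload only regains its structure once it is unpickled, which is why the data is described as a byte array when no output type is given.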

## Lambda Function
As shown in the following example, transformations can also accept a lambda function to define their functionality:
<p>
{% highlight python %}
data_stream = env.from_collection([1, 2, 3, 4, 5], type_info=Types.INT())
mapped_stream = data_stream.map(lambda x: x + 1, output_type=Types.INT())
{% endhighlight %}
</p>
<span class="label label-info">Note</span> The operations `ConnectedStreams.map()` and `ConnectedStreams.flat_map()` do not support
lambda functions and must be given a `CoMapFunction` and a `CoFlatMapFunction`, respectively.

## Python Function
Users can also pass a regular Python function:
<p>
{% highlight python %}
def my_map_func(value):
    return value + 1

data_stream = env.from_collection([1, 2, 3, 4, 5], type_info=Types.INT())
mapped_stream = data_stream.map(my_map_func, output_type=Types.INT())
{% endhighlight %}
</p>
