As an open source alternative to Segment, RudderStack collects and routes event stream (or clickstream) data and automatically builds your customer data lake on your data warehouse.
[RudderStack][2] is an open source, warehouse-first customer data pipeline. It collects and routes event stream (or clickstream) data and automatically builds your customer data lake on your data warehouse.
RudderStack is commonly known as the open source alternative to the customer data platform (CDP) [Segment][3]. Compared with Segment, it provides a more secure, flexible, and cost-effective solution. You get all the CDP functionality with added security and full ownership of your customer data.
Warehouse-first tools like RudderStack are architected to build functional data lakes in the user's data warehouse. The benefits are improved data control, increased flexibility in tool use, and (frequently) lower costs. Since it's open source, you can see how complicated processes, like building your identity graph, are done without relying on a vendor's black box.
### Getting the RudderStack workspace token
Before you get started, you will need the RudderStack workspace token from your RudderStack dashboard. To get it:
*If you plan to use RudderStack in production, we strongly recommend using this method.* This is because the Docker images are updated with bug fixes more frequently than the GitHub repository (which follows a monthly release cycle).
This deploys RudderStack on your default Kubernetes cluster configured with kubectl using the workspace token you obtained from the RudderStack dashboard.
For more details on the configurable parameters in the RudderStack Helm chart or updating the versions of the images used, consult the [documentation][14].
2. In this tutorial, you will verify RudderStack by sending test events to Google Analytics. Make sure you have a Google Analytics account and keep the tracking ID handy. Also, note that the Google Analytics account needs to have a `Web` property.
3. In the [RudderStack hosted control plane][16]:
* Add a source on the RudderStack dashboard by following the [Adding a source and destination in RudderStack][17] guide. You can use any of RudderStack's event stream software development kits (SDKs) for sending events from your app. This example sets up the [JavaScript SDK][18] as a source on the dashboard. Note: You aren't actually installing the RudderStack JavaScript SDK on your site in this step; you are just creating the source in RudderStack.
* Configure a Google Analytics destination on the RudderStack dashboard using the instructions in the guide mentioned previously. Use the Google Analytics tracking ID you kept from step 2 of this section:
6. Finally, log into your Google Analytics account and verify that the events were delivered. In your Google Analytics account, navigate to **RealTime** -> **Events**. The RealTime view is important because some dashboards can take one to two days to refresh.
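
To make the verification steps above more concrete, here is a minimal Python sketch of what a single test `track` event sent to the data plane could look like. It is an illustration, not the tutorial's own method. The data plane URL (`http://localhost:8080`), the `/v1/track` path, and Basic authentication with the source write key as the username are assumptions based on RudderStack's HTTP API, so check them against your deployment. The function name `build_track_request` and the placeholder write key are hypothetical.

```python
import base64
import json
import urllib.request
from datetime import datetime, timezone
from typing import Optional


def build_track_request(
    data_plane_url: str,
    write_key: str,
    user_id: str,
    event: str,
    properties: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one track event.

    Assumption: the data plane accepts JSON at <data_plane_url>/v1/track and
    authenticates with HTTP Basic auth (write key as username, empty password).
    """
    payload = {
        "userId": user_id,
        "event": event,
        "properties": properties or {},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Basic auth token: base64("<write_key>:") with an empty password.
    token = base64.b64encode(f"{write_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{data_plane_url.rstrip('/')}/v1/track",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical values: replace with your own data plane URL and source write key.
request = build_track_request(
    "http://localhost:8080",
    "YOUR_WRITE_KEY",
    user_id="test-user-1",
    event="Test Event",
    properties={"source": "tutorial"},
)
```

Passing `request` to `urllib.request.urlopen` would actually send the event. If the source and destination are configured as described, the event should then appear in the Google Analytics RealTime view.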
RudderStack's core architecture contains two major components: the data plane and the control plane. The data plane, [rudder-server][29], delivers your event data, and the RudderStack hosted control plane manages the configuration of your sources and destinations.
However, if you want to manage the source and destination configurations locally, you can set up an open source control plane in your environment using the RudderStack Config Generator. (You must have [Node.js][30] installed on your system to use it.)
You should now be able to access the open source control plane at `http://localhost:3000` by default. If your setup is successful, you will see the user interface.
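
If you prefer to confirm from a script that the control plane is listening before opening a browser, a small check like the following could help. This is a generic HTTP reachability sketch, not part of the Config Generator. The helper name `is_reachable` is hypothetical, and the only value taken from the text above is the default address `http://localhost:3000`.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise.

    Any HTTP response, including an error status such as 404, counts as
    reachable because it proves a server is listening on that address.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar problems.
        return False
```

For example, `is_reachable("http://localhost:3000")` should return `True` once the open source control plane is running, and `False` if nothing is listening on that port.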
The core of RudderStack is in the [rudder-server][33] repository. It is open source, licensed under [AGPL-3.0][34]. A majority of the destination integrations live in the [rudder-transformer][35] repository. They are open source as well, licensed under the [MIT License][36]. The SDKs and instrumentation repositories, several tool and utility repositories, and even some [dbt][37] model repositories for use-cases like customer journey analysis and sessionization for the data residing in your data warehouse are open source, licensed under the MIT License, and available in the [GitHub repository][38].
You can use RudderStack's open source offering, rudder-server, on your platform of choice. There are setup guides for [Docker][39], [Kubernetes][40], [native installation][41], and [developer machines][42].
RudderStack also offers a managed option, [RudderStack Cloud][43]. It is fast, reliable, and highly scalable with a multi-node architecture and sophisticated error-handling mechanism. You can hit peak event volume without worrying about downtime, loss of events, or latency.