This documentation explains the Shuffle architecture and the reasoning behind our choices. It is important reading if you want to contribute, or if you are deciding whether Shuffle works for your organization.
With a long-term vision of having an Open(API) ecosystem with a hybrid model between cloud and on-prem, this document will be a guide to understand some underlying aspects of Shuffle and how things fit together. Shuffle does NOT require internet to work.
Shuffle Installation models:
The platform is split into two main parts: Server and Workers. The Server acts as the host of everything from API activity to Workflow validation, while the Workers are a separate standalone unit, working in a microservice-esque way. The two parts can be installed on different hosts and clustered.
Shuffle uses and is built upon existing, well-established frameworks to help the security community move forward, rather than just increase complexity.
The Shuffle automation engine is built entirely from scratch, relying heavily on Docker and real-time, microservice-style executions. The code for decision making of next nodes can be found here, while the SDK for how Apps perform can be found here. Read more here.
The fundamental building blocks of Shuffle are all designed to be modular Docker images, meaning they can run separately in different environments. The list below contains all the necessary parts to execute a workflow. In case you want to contribute, we've added the programming languages as well.
The reason behind the usage of Golang is simple: Stability. Scripting languages like Python are prone to crashing, while Golang is fun, stable and easy to understand.
Type | Technology | Note |
---|---|---|
Frontend | ReactJS | Cytoscape graphs & Material design |
Backend | Golang | REST API that connects all the different parts |
Database | OpenSearch | A scalable NoSQL database used as the document store for everything. |
Orborus | Golang | Runs workers in a specific environment to connect locations |
Worker | Golang | Deploys Apps to run Actions defined in a workflow |
App SDK | Python | Used by Apps to talk to the backend |
Shuffle uses semantic versioning. All Docker images can be found on GitHub or on Docker Hub.
Shuffle is a quite complex platform, with many features to handle whatever automation requires. We are an API-first platform, meaning we always build the API for a feature before developing the frontend. This lets us prototype and release new features quickly, without necessarily breaking existing functionality. Most of our API endpoints use the following model to authenticate and authorize whether the user has access:
The data gathered is processed, before Shuffle returns either of these codes (with certain exceptions):
If there's a failure, Shuffle should ALWAYS return with data in this format:
{"success": false, "reason": "Here's the reason it didn't work"}
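As a minimal sketch of how a client could consume this failure format: the helper below parses a response body and reports whether it is a failure. The function name `check_response` is hypothetical, and treating any body without `"success": false` as a success is an assumption made for illustration, not documented Shuffle behavior.

```python
import json


def check_response(body):
    """Return (ok, reason) for a JSON response body.

    Assumption: only a JSON object carrying "success": false counts as a
    failure; anything else (objects, lists) is treated as a success.
    """
    data = json.loads(body)
    if isinstance(data, dict) and data.get("success") is False:
        return False, data.get("reason", "")
    return True, ""


# Python's False serializes to the lowercase JSON literal false.
failure_body = json.dumps({"success": False, "reason": "Workflow not found"})
ok, reason = check_response(failure_body)
print(ok, reason)
```

Checking `is False` rather than general truthiness keeps a response that has no `success` key from being reported as an error.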
The frontend takes the data and shows it in the UI.
There are two types of authentication tokens in Shuffle: API/Session access, and App authentication.
Data access is scoped PER ORGANIZATION, based on the organization your user is in. You need to be a member of the organization you are using the API for, and requests use that organization by default unless you set the Org-Id header in your request. An Org Admin has access to all information within an Organization, while an Org User can read and modify Workflows, Apps, Files, Datastore and Trigger management. An Org Reader can READ the same data the Org User can read and modify.
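The role rules above can be sketched as a small permission check. This is only an illustration: the role strings (`"admin"`, `"user"`, `"reader"`), the resource names and the function `is_allowed` are hypothetical and are not taken from the Shuffle codebase.

```python
# Resources an Org User can read and modify, per the description above.
# The lowercase names are illustrative placeholders.
SHARED_RESOURCES = {"workflows", "apps", "files", "datastore", "triggers"}


def is_allowed(role, action, resource):
    """Decide whether a role may perform an action ("read" or "modify")."""
    if role == "admin":
        # Org Admins have access to everything within the organization.
        return True
    if resource not in SHARED_RESOURCES:
        return False
    if role == "user":
        return action in {"read", "modify"}
    if role == "reader":
        return action == "read"
    return False


print(is_allowed("reader", "read", "workflows"))
print(is_allowed("reader", "modify", "workflows"))
```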
Hashed (bcrypt):
Encrypted (AES-256):
App authentication values and Files are encrypted. The seed used for hashing is random for each organization, and can be set with the environment variable SHUFFLE_ENCRYPTION_MODIFIER in the local version of Shuffle. This is automatically handled in our SaaS offering. How it works:
1. Create an MD5 hash from Org ID + Workflow ID + Auth timestamp + SHUFFLE_ENCRYPTION_MODIFIER
2. Encrypt the authentication value with [aes.NewCipher](https://cs.opensource.google/go/go/+/go1.17.1:src/crypto/aes/cipher.go;l=32)
3. Base64 encode the encrypted value (because bytes and strings aren't friends)
4. Store the value in the database.
Run the steps in reverse to decrypt and retrieve the values.
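Steps 1 and 3 above can be sketched with the Python standard library. Several details are assumptions made for illustration: the four inputs are joined by plain concatenation in the listed order, and the hex form of the MD5 digest (32 ASCII characters, which happens to match the 32-byte key length of AES-256) is used as the key. The AES step itself is left out because the Python standard library has no AES implementation. All function names are hypothetical.

```python
import base64
import hashlib


def derive_key(org_id, workflow_id, auth_timestamp, modifier):
    """Step 1: MD5 over the concatenated inputs (assumed order, no separator).

    The hex digest is 32 ASCII characters, so it is 32 bytes when encoded.
    """
    passphrase = f"{org_id}{workflow_id}{auth_timestamp}{modifier}"
    return hashlib.md5(passphrase.encode("utf-8")).hexdigest().encode("ascii")


def encode_for_storage(ciphertext):
    """Step 3: Base64 encode the encrypted bytes so they store as a string."""
    return base64.b64encode(ciphertext).decode("ascii")


def decode_from_storage(stored):
    """Reverse of step 3: recover the encrypted bytes before decrypting."""
    return base64.b64decode(stored)


key = derive_key("org-1", "workflow-1", "1700000000", "my-modifier")
print(len(key))
```

Because every input feeds the hash, changing the organization, workflow, timestamp or modifier yields a different key for the same credential.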
Encryption Code reference
There are multiple ways to access the API. The first is through the UI and a logged in user. The second is through the API directly with a Bearer token. The third is from a workflow execution.
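A minimal sketch of the second access method, a direct API call with a Bearer token, is shown below. It only builds the request and does not send it. The host `shuffle.example.com`, the path `/api/v1/workflows`, the placeholder credentials and the helper name `build_request` are assumptions for illustration; the optional `Org-Id` header selects an organization other than the default one.

```python
import urllib.request


def build_request(base_url, path, api_key, org_id=None):
    """Build (but do not send) an authenticated GET request."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if org_id:
        # Overrides the default organization for this request.
        headers["Org-Id"] = org_id
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers=headers, method="GET")


req = build_request(
    "https://shuffle.example.com",  # hypothetical host
    "/api/v1/workflows",            # assumed endpoint path
    "YOUR_API_KEY",
    org_id="YOUR_ORG_ID",
)
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is omitted so the sketch runs without network access.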
Session management in Shuffle is currently quite basic, with the goal to drastically improve it in 2024. The current flow reuses the same session token for everyone using an account, and the only way to log out everyone else is to log out yourself. The goal is to use one session token per device.
App authentication is how we authenticate and store an app's configuration. If an app requires authentication, and the user adds the authentication credentials, these credentials will be encrypted and stored in the database, along with being cached in their encrypted form (AES-256). These values can and should NEVER be decrypted to be seen by a process or human other than during a workflow execution.
How are these values being used then? If they're encrypted, how does the app get access to them? Here's how:
The execution model of Shuffle can be defined as such:
A worker is started FOR EACH EXECUTION, which is in control of the entire duration of an execution. It has two modes:
Unoptimized: If a workflow is started manually, the Worker will periodically poll the backend for updates, pointing all apps to send direct information to the backend. This makes it possible for the user to see updates in real-time for debug purposes.
Optimized: If a Workflow is started from a trigger (not manually), the Worker starts an HTTP server, and acts as a temporary backend for this specific execution. This makes it possible to communicate and deploy Apps faster, without straining the backend. When the workflow is finished, it will send the full execution to the backend. This means the frontend MAY not have the full picture until after the workflow execution finishes.
The Worker attempts to start each App's Docker container, beginning with the start node. When it finds that a node has finished, it checks its status, and on success starts the following nodes. If the App's Docker image doesn't exist, the Worker attempts to download it, trying the Backend first and then Dockerhub. If the image can't be found in either, the workflow is aborted.
The Apps started by a Worker retrieve the full execution so they can identify variables, check conditions, handle authorization, download files, read from the cache, etc. Order of operations in an App's Docker container:
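The traversal described above can be sketched roughly as follows. This is a simplified, sequential model and not the Worker's actual code: the real Worker is written in Golang and runs Apps as Docker containers. The names `resolve_image` and `run_workflow`, the in-memory image sources and the choice to abort on a failed action are all assumptions made for illustration.

```python
from collections import deque


def resolve_image(image, sources):
    """Return the first source holding the image, in order, else None."""
    for name, available in sources:
        if image in available:
            return name
    return None


def run_workflow(start, next_nodes, images, sources, run_action):
    """Run nodes from the start node, following edges only after success."""
    queue = deque([start])
    seen = {start}
    finished = []
    while queue:
        node = queue.popleft()
        if resolve_image(images[node], sources) is None:
            return "ABORTED", finished  # image not found in any source
        if not run_action(node):
            return "ABORTED", finished  # assumption: a failed action aborts
        finished.append(node)
        for following in next_nodes.get(node, []):
            if following not in seen:
                seen.add(following)
                queue.append(following)
    return "FINISHED", finished


# Lookup order mirrors the text: Backend first, then Dockerhub.
sources = [("backend", {"app-a"}), ("dockerhub", {"app-b"})]
next_nodes = {"start": ["second"], "second": []}
images = {"start": "app-a", "second": "app-b"}
print(run_workflow("start", next_nodes, images, sources, lambda node: True))
```

The `run_action` callback stands in for starting a container and waiting for its result, which keeps the sketch runnable without Docker.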
When the Execution in the Worker is in either the FINISHED or ABORTED state, the Worker sends information back to the original Backend about the status of the execution.
To learn about the code behind the execution, check here
As we've hinted at more than a few times, everything in Shuffle is built around Docker containers. This keeps the environment safe even in cases of compromise, and is necessary to allow a user to build an app for pretty much anything. This allows us to sandbox and control how actions are performed, as it's all built on top of our App SDK. On top of all this? It makes sharing and deploying easier between Shuffle users, which is paramount to a scalable, global system.
The most important parts of containerization in Shuffle are these:
Contact us or mail us at help@shuffler.io, and we'll provide you with the necessary information.