Synthloom — Step-by-Step Tutorial
This guide walks you through every section of the Synthloom application. Follow the steps in order for your first run, or jump directly to any topic using the sidebar.
The core loop
Login
Synthloom uses token-based authentication. Every user must log in before accessing any feature.
1. Open the application: Navigate to https://www.synthloom.in. You will be redirected to the Login page automatically.
2. Enter your credentials: Type your username and password in the form fields.
3. Click Sign In: On success, you are redirected to the Workspace Selector. Your session token is stored securely in memory for the duration of your browser session.
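The same token flow applies to any programmatic client: authenticate once, keep the token in memory, and attach it to each request. A minimal sketch of the idea; the class, the login endpoint mentioned in the comment, and the header handling are illustrative assumptions, not Synthloom's documented API.

```python
class SynthloomSession:
    """Keeps the session token in memory only, mirroring the browser behaviour."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")
        self._token = None  # never written to disk; gone when the session ends

    def store_token(self, token: str) -> None:
        # Called after a successful sign-in (hypothetical POST /api/auth/login).
        self._token = token

    def auth_headers(self) -> dict:
        # Attach the bearer token to every subsequent request.
        if self._token is None:
            raise RuntimeError("not logged in")
        return {"Authorization": f"Bearer {self._token}"}


session = SynthloomSession("https://www.synthloom.in")
session.store_token("example-token")
headers = session.auth_headers()  # {'Authorization': 'Bearer example-token'}
```

Because the token lives only in an object attribute, closing the client (or the browser tab, in the web app) discards it.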
Workspaces
Workspaces are isolated project containers. Each workspace has its own configurations, generation history, and output files — completely separate from other workspaces.
Creating a workspace
1. Open the Workspace Selector: After login you land here automatically, or click your workspace name in the top navigation bar at any time.
2. Click "New Workspace": Enter a descriptive name (e.g. Retail Demo, Healthcare Q2). Names help you identify the project at a glance.
3. Confirm creation: The new workspace is created and becomes your active context immediately. All subsequent actions apply to this workspace.
Other workspace actions
| Action | How |
|---|---|
| Switch workspace | Click any workspace card to make it active |
| Rename workspace | Click the pencil icon next to the workspace name |
| Share workspace | Click the share icon → enter a colleague's email and choose a role (Viewer / Editor) |
| Delete workspace | Click the trash icon — this is permanent and removes all configurations and output |
Defining Your Data Model
The Design Studio is where you describe your data model — the entities you want to generate, the fields they contain, and how they relate to each other.
Opening the Design Studio
1. Navigate to "Design Studio": Click Design Studio in the left navigation menu of the application.
2. Start a new config or load an existing one: Click New to start from scratch, or select a previously saved configuration from the dropdown. You can also import a JSON file.
Adding an entity
1. Click "Add Entity": An entity represents a table or dataset, for example Customers, Orders, or Products. Give it a clear singular name.
2. Set the record count: Enter how many rows you want to generate for this entity (e.g. 1000 customers, 50000 orders).
Adding fields to an entity
1. Click "+ Add Field": A new field row appears. Give it a name matching your target schema (e.g. customer_id, email, created_at).
2. Choose a field type: Click the type picker to open the full type menu. Types are organized into groups:

| Group | Examples |
|---|---|
| Identity | UUID, Integer, Auto Increment |
| Personal | Full Name, First Name, Email, Phone |
| Location | City, State, Country, Zip Code, Address |
| Date & Time | Date, DateTime, Timestamp, Year |
| Financial | Price, Currency, Amount, IBAN |
| Internet | URL, IP Address, User Agent |
| Business | Company, Job Title, Department |
| Text | Paragraph, Sentence, Word, Lorem Ipsum |
| Special | Foreign Key, Enum, Boolean, Constant |

3. Configure constraints (optional): Depending on the type you may set min/max values, allowed enum values, a specific format, or mark the field for AI enrichment.
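Put together, an exported configuration for a simple entity might look like the following JSON. The exact export schema is not documented in this guide, so the property names here (entities, recordCount, fields, and so on) are illustrative assumptions, not the real file format:

```json
{
  "entities": [
    {
      "name": "Customer",
      "recordCount": 1000,
      "fields": [
        { "name": "customer_id", "type": "uuid", "unique": true },
        { "name": "email", "type": "email", "nullable": false },
        { "name": "created_at", "type": "datetime", "min": "2023-01-01", "max": "2024-12-31" }
      ]
    }
  ]
}
```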
Defining relationships (Foreign Keys)
1. Add a Foreign Key field: Choose Foreign Key as the field type on the child entity (e.g. customer_id on the Orders entity).
2. Select the referenced entity and field: Pick the parent entity (Customers) and the field it references (customer_id). Synthloom will automatically sample valid IDs from the parent during generation.
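Conceptually, sampling valid IDs means every child value is drawn from the set of keys the parent actually produced. A minimal sketch of that idea, not Synthloom's actual implementation:

```python
import random

def sample_foreign_keys(parent_ids, n_children, seed=None):
    """Draw one existing parent key per child row, with replacement."""
    rng = random.Random(seed)
    return [rng.choice(parent_ids) for _ in range(n_children)]

customer_ids = ["c1", "c2", "c3"]
order_fks = sample_foreign_keys(customer_ids, 10, seed=42)
# Every FK references a real parent row, so referential integrity holds by construction.
assert set(order_fks) <= set(customer_ids)
```

Sampling with replacement also gives the natural one-to-many shape: several orders can share one customer.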
Saving your configuration
1. Click the Save button: Enter a configuration name (e.g. Retail v1) and confirm. Configurations are stored in your active workspace and can be reused across multiple generation runs.
2. Export to JSON (optional): Use the Export JSON button to download the configuration file. You can version-control it in Git or share it with teammates.
Running a Generation Job
The Generate page lets you launch a generation job, configure output options, and monitor progress in real time via WebSocket updates.
Starting a generation
1. Navigate to "Generate": Click Generate in the left navigation menu.
2. Select a configuration: Choose the saved configuration you created in the previous step from the dropdown. The config name and entity list are shown for confirmation.
3. Choose your output format:

| Format | Best for |
|---|---|
| CSV | Spreadsheets, general-purpose tools |
| JSON | APIs, document stores, web apps |
| Parquet | Big data pipelines, analytics engines |

4. Configure generation options:

| Option | Description |
|---|---|
| Seed | Integer seed for reproducibility. The same seed produces identical output every time. |
| Validate RI | Run referential integrity checks after generation (recommended). |
| Parallel mode | Generate independent entity groups concurrently for speed. |
| Enable AI | Pass marked fields through an LLM for realistic contextual text. |
| AI Provider | OpenAI or Anthropic (configured in Settings). |

5. Click "Generate": The job starts immediately. A real-time progress panel appears showing the current entity being processed, records written, and elapsed time.
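The Seed option behaves like any pseudo-random seed: reusing it replays the exact same draw sequence. The sketch below illustrates this with Python's standard random module; Synthloom's internal RNG is an implementation detail and may differ.

```python
import random

def generate_prices(seed, n=5):
    """Deterministic 'synthetic prices': the same seed yields the same rows."""
    rng = random.Random(seed)
    return [round(rng.uniform(1.0, 100.0), 2) for _ in range(n)]

run_a = generate_prices(seed=1234)
run_b = generate_prices(seed=1234)
assert run_a == run_b  # identical seed, identical output
```

This is why recording the seed alongside a job (as the History page does) is enough to reproduce its dataset.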
Reading the progress panel
While a job runs you can see:
- Current entity — the entity currently being generated
- Records written — live count updated over WebSocket
- Stage — generation level (parallel groups are labelled)
- Elapsed time — wall-clock duration since job start
- Log tail — last few log lines from the backend for debugging
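A client consuming the same WebSocket stream would decode each frame as JSON before rendering it. A sketch under an assumed message shape; the field names (recordsWritten, elapsedSeconds, etc.) are illustrative, not the documented wire format.

```python
import json
from dataclasses import dataclass

@dataclass
class Progress:
    entity: str
    records_written: int
    stage: str
    elapsed_seconds: float

def parse_progress(frame: str) -> Progress:
    """Decode one WebSocket text frame into a typed progress update."""
    msg = json.loads(frame)
    return Progress(
        entity=msg["entity"],
        records_written=msg["recordsWritten"],
        stage=msg["stage"],
        elapsed_seconds=msg["elapsedSeconds"],
    )

frame = '{"entity": "Orders", "recordsWritten": 12500, "stage": "level-2", "elapsedSeconds": 8.4}'
update = parse_progress(frame)
```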
Viewing & Downloading Output
The Output Viewer gives you a searchable, paginated data table for every generated file, plus one-click download options.
1. Navigate to "Output": After a job completes you are prompted to view the output, or navigate manually via the left menu.
2. Select an output folder: Each generation run produces a timestamped folder. Click a folder to see the files it contains (one file per entity by default).
3. Preview a file: Click any file to load an inline data table. Use the search bar to filter rows, the column selector to hide/show fields, and the pagination controls to navigate large datasets.
4. Download: Click Download to save the file in its original format (CSV, JSON, or Parquet). You can also download all files in the folder as a ZIP.
Validating Your Dataset
The Validation Results page shows a detailed report on the quality of the last generation run — covering referential integrity, uniqueness, value ranges, and custom business-rule compliance.
1. Navigate to "Validation": Click Validation in the left menu. Results from the most recent job are loaded automatically.
2. Review the summary: A top-level summary card shows the overall pass/fail status and a count of checks performed.
3. Inspect individual checks: Each check is listed with its entity, field, type, and result. Failed checks show the number of violations and example values.
Validation check types
| Check | What it verifies |
|---|---|
| Referential Integrity | Every FK value in child entities exists in the parent entity |
| Uniqueness | Fields marked unique have no duplicate values |
| Range | Numeric fields are within the configured min/max bounds |
| Not Null | Required fields contain no null values |
| Enum | Enum fields only contain allowed values |
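At its core, a referential integrity check is a set-membership test between the child's FK column and the parent's key column. A minimal sketch of the idea, not the actual validator:

```python
def check_referential_integrity(child_rows, fk_field, parent_rows, pk_field):
    """Return the FK values in child_rows with no matching parent key."""
    parent_keys = {row[pk_field] for row in parent_rows}
    return [row[fk_field] for row in child_rows if row[fk_field] not in parent_keys]

customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 99}]
violations = check_referential_integrity(orders, "customer_id", customers, "customer_id")
# violations == [99]: order 11 points at a customer that does not exist
```

The other checks in the table follow the same pattern: uniqueness is a duplicate count, range and not-null are per-value predicates, and enum is membership in the allowed set.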
Pipeline View
The Pipeline page renders an interactive dependency graph of your data model, showing how entities are ordered for generation and which groups can run in parallel.
1. Navigate to "Pipeline": Click Pipeline in the left menu.
2. Inspect the dependency graph: Nodes represent entities. Directed edges represent foreign key dependencies. Entities at the same generation level are highlighted together; these run concurrently in parallel mode.
3. Check for circular dependencies: If a circular dependency is detected it is flagged here in red, preventing generation until you resolve it in the Design Studio.
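Generation levels and cycle detection both fall out of a topological sort of the FK graph: each level contains the entities whose parents all sit in earlier levels, and a cycle means no valid ordering exists. A sketch using Kahn's algorithm (not necessarily how Synthloom computes it):

```python
def generation_levels(deps):
    """deps maps each entity to the entities it depends on (its FK parents).
    Returns a list of levels; raises ValueError on a circular dependency."""
    remaining = {entity: set(parents) for entity, parents in deps.items()}
    levels = []
    while remaining:
        # Entities with no unmet dependencies can all be generated now.
        ready = sorted(e for e, parents in remaining.items() if not parents)
        if not ready:
            raise ValueError(f"circular dependency among: {sorted(remaining)}")
        levels.append(ready)
        for entity in ready:
            del remaining[entity]
        for parents in remaining.values():
            parents.difference_update(ready)
    return levels

deps = {"Customers": [], "Products": [], "Orders": ["Customers"],
        "OrderItems": ["Orders", "Products"]}
print(generation_levels(deps))
# [['Customers', 'Products'], ['Orders'], ['OrderItems']]
```

Each inner list corresponds to one highlighted level in the graph; its members are the groups that parallel mode runs concurrently.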
Generation History
The History page is an audit trail of every generation job run in the current workspace, including timestamps, configuration used, record counts, and final status.
1. Navigate to "History": Click History in the left menu.
2. Browse past jobs: Each row shows the job ID, configuration name, start time, duration, total records, output format, and status (Completed / Failed / Cancelled).
3. Jump to output or re-run: Click a job to open its output directly in the Output Viewer, or click Re-run to launch a new job with identical settings.
AI Settings
The Settings page is where you configure AI provider credentials and select the default model used for field enrichment.
Configuring OpenAI
1. Navigate to "Settings": Click Settings in the left menu.
2. Enter your OpenAI API key: Paste the key (starts with sk-…) into the OpenAI card. Use the eye icon to verify it before saving.
3. Choose a model:

| Model | Best for |
|---|---|
| GPT-4o (recommended) | Best quality / speed balance |
| GPT-4o Mini | Large volumes at lower cost |
| GPT-4 Turbo | Complex contextual generation |
| GPT-3.5 Turbo | Fastest, lowest cost |

4. Click "Test Connection": Synthloom sends a minimal ping to the API to verify the key is valid before saving.
5. Save: The key is stored encrypted in the workspace settings. A masked hint (sk-••••1234) confirms the key is active.
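A masked hint can be derived once and shown forever without re-displaying the full key. A sketch of a masking rule that matches the sk-••••1234 format; the exact rule Synthloom applies is an assumption:

```python
def mask_api_key(key: str, visible: int = 4) -> str:
    """Show only the provider prefix and the last few characters of the key."""
    prefix, _, secret = key.partition("-")
    return f"{prefix}-••••{secret[-visible:]}"

print(mask_api_key("sk-abcdefghijklmnop1234"))  # sk-••••1234
```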
Configuring Anthropic Claude
The process mirrors OpenAI. Available Claude models:
| Model | Best for |
|---|---|
| Claude Haiku 4.5 (recommended) | Fast, cost-effective enrichment |
| Claude Opus 4.6 | Highest quality output |
| Claude Sonnet 4.5 | Balanced quality and speed |
Admin Console
The Admin Console is available to administrator accounts only. It provides user management, workspace oversight, and system-level controls.
1. Navigate to "Admin": The Admin link appears in the navigation only if your account has the admin role.
2. Manage users: View all registered users, their roles, and their active workspaces. You can promote users to admin, reset passwords, or deactivate accounts.
3. Oversee workspaces: See all workspaces across all users; useful for monitoring storage usage and cleaning up orphaned workspaces.
4. System settings: Configure global defaults such as the default AI provider, max parallel workers, and record count limits per workspace.