Techno Blender
Digitally Yours.

Create Many-To-One Relationships Between Columns in a Synthetic Table with PySpark UDFs

Leverage some simple equations to generate related columns in test tables.

Image generated with DALL-E 3

I’ve recently been playing around with Databricks Labs Data Generator to create completely synthetic datasets from scratch. As part of this, I’ve looked at building sales data around different stores, employees, and customers, and I wanted to create relationships between the columns I was artificially populating, such as mapping employees and customers to a certain store. Through using PySpark UDFs and a bit of…
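The excerpt cuts off before the author's own equations appear, so what follows is only a rough sketch of one way a many-to-one mapping like "employee to store" might be expressed. The integer-division rule, the function names `store_for_employee` and `with_store_column`, the column names, and the block size of 5 are all assumptions made for illustration; none of them come from the article.

```python
def store_for_employee(employee_id: int, employees_per_store: int = 5) -> int:
    """Map an employee ID to a store ID (hypothetical rule, not the article's).

    Integer division puts consecutive IDs into fixed-size blocks:
    with the default block size, IDs 0-4 map to store 0, IDs 5-9 to store 1, etc.
    Many employees share one store, and each employee has exactly one store.
    """
    return employee_id // employees_per_store


def with_store_column(df, id_col="employee_id", out_col="store_id",
                      employees_per_store=5):
    """Attach the derived store column to a Spark DataFrame via a UDF."""
    # Imported lazily so the plain-Python mapping above works without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import IntegerType

    # Wrap the plain function in a UDF; pass nulls through unchanged.
    to_store = F.udf(
        lambda i: None if i is None else store_for_employee(i, employees_per_store),
        IntegerType(),
    )
    return df.withColumn(out_col, to_store(F.col(id_col)))
```

Assuming an active `SparkSession` named `spark`, a call such as `with_store_column(spark.range(20).withColumnRenamed("id", "employee_id"))` should add a `store_id` column in which every five consecutive employees share a store. A rule this simple could also be written with built-in column expressions instead of a UDF, which generally run faster; a UDF is shown here only because that is the tool the article names.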