Skip to main content

RETL vs SQL Traits


Since RETL was released I am debating whether or not RETL is a replacement for SQL Traits. What you think?

Admittedly, I haven’t set up RETL yet but this is my interpretation after reading through the docs back when it was released. I’m sure there are some added benefits of moving to RETL but I haven’t quite made the leap.

Going to follow this thread to see if there is more context from others who have established RETL.


Got it! Thanks for sharing Corey :) 


Hey Andrew - I’m on the Team at Segment that uses Segment, and we moved almost all of our SQL Traits to rETL in the last year. Reasons being: we were hitting the 100 column limit, rETL is generally more robust (less risk of someone changing something accidentally), and we also wanted to send some of these traits to destinations directly (along with using them in Engage for audiencing). It was a bit on a pain to move them initially, but we were also using the early beta version of rETL - it would be easier now.


SQL Traits is no longer for sale, so I believe it’s being superseded by RETL.

https://segment.com/docs/unify/Traits/sql-traits/

RETL is a much more flexible and robust feature than SQL Traits and I am working to migrate from SQL Traits to RETL.


We make a lot of use of the rETL functionalities. Not only to create traits on a person, but also to create track events. 


We still have some SQL traits performing their job quite well, but for all new traits, we are adopting rETL. In my opinion, SQL traits are easier to set up and use, but rETL offers more flexibility.


The more I use Reverse ETL the more I like it :)


I went through this exercise and converted all our SQL Traits to RETL and these were my findings:

  • Watch cost.  Our Snowflake compute credits doubled (we will definitely max out on RETL) at the initial shift.
  • Understand how many models and how many destinations for each model.  Each destination for the model will contribute to # of records for your RETL limit.  So you could run three models or one large model… it becomes a numbers game
  • Compute credits are gold as well.  Freeing up some SQL traits allowed us to add more computed traits
  • RETL traits sent in via Segment Profiles are custom traits.  Unless they were sent in via SQL, then they stay SQL traits.  Little quirk to remember when building audiences and journeys.

All in all, through testing and many iterations over a few weeks, I ended on a hybrid of SQL &  RETL.  Traits with long sync times or many changes with 1 destination, I moved back to SQL.  Queries with many destinations, that needed faster sync times, stayed in RETL.  Best of both worlds.


We have about 60+ audiences built around SQL traits at the moment - we are actively migrating all our sql traits over to reverseETL instead. 

We’re doing this for these key reasons:
1 - Operationalise data we have in our warehouse to organise customer data into logical entities. Eg. User, Account, Product, Comms Preferences, Consent etc. We Use identity resolution rules to ensure that these entities have relationships that match what we have setup in Segment.
2 - Use reverseETL mapped to → Identify and Group events to enrich User and Account profiles from systems of record that don’t have telemetry events going through Segment sources. 
3 -Our strategic roadmap includes milestones for achieving automated 1:1 personalisation at scale. We are adopting “linked audiences” and use these entities in the warehouse to pull in contextual data at the time of activation to achieve this.

Key learnings so far from RETL migration:
 -As mentioned by @Julie S  - consider frequency of RETL syncs (this affects your data freshness obviously, but also adds up in your RETL usage - also consider cost of processing in the warehouse ( query and storage)
- Compute credits are being phased out in place of unit limits per feature. Eg. if you have 500m API events per month in your agreement, then you get X computed traits and audiences per annum, Y journey steps per annum and Z RETL events per annum.
- Similar to audiences and traits in identify and group events from your sources, you cannot delete custom traits against a profile once it’s on the profile. This also applies to traits sent into segment by RETL. What makes this particularly risky is that there’s no “Tracking plan” capability or quality gate in this process to prevent typos and data quality issues. I highly recommend setting up a new space and test your RETL out first before you setup a RETL into your production environment.
- On the same thread as the point above, consider quality processes in the warehouse to prevent mass updates to RETL jobs that will have wide ranging effects on your profile data, downstream audiences, journeys, and destinations. 
We use snowflake and dbt in our data warehouse. We ensure that the data we use in the RETL is updated through controlled processes that enforce data quality, governance compliance, alerting, monitoring and access control etc.

Any questions - reach out!
​​​​​​​
​​​​​​​
 


Thanks for sharing your key learnings on rETL @Peter Yanny, very helpful insight!


Reply