About
In this webinar we cover the basics of what a data warehouse is and how data warehouses integrate with Segment.
--
Q&A recap
Q: With a warehouse destination, can I pick what kind of events I want to send, like only sending page views for example?
That's exactly right. The warehouse destination has a setting called Selective Sync. From there, you can choose which sources are sent into the data warehouse and which events or tables are represented there. It even gets as granular as the individual properties that you'd like to have sent.
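To make the three levels of granularity concrete (source, table, property), here is a minimal sketch in Python. It is only an illustration of the idea, not Segment's configuration format or API: the `selection` structure, the `is_synced` helper, and the source and table names are all hypothetical.

```python
# Hypothetical selection: source -> table -> set of columns to sync.
# A value of None means "sync every column of that table".
selection = {
    "website": {
        "pages": None,
        "order_completed": {"order_id", "total"},
    },
}


def is_synced(selection, source, table, column=None):
    """Return True if the given source/table (and optionally column) is selected."""
    tables = selection.get(source)
    if tables is None or table not in tables:
        return False
    columns = tables[table]
    # No column asked for, or all columns selected, or this column selected.
    return column is None or columns is None or column in columns


print(is_synced(selection, "website", "pages"))
print(is_synced(selection, "website", "order_completed", "coupon"))
```

In this sketch, only page views and two properties of one event would reach the warehouse; everything else from the source is skipped.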
Q: Can we use Segment to pass data from one database to another i.e. batch data?
You do have the ability to set up two individual warehouses in your workspace and then connect whichever sources you want. In that way, you'd be taking individual source data and pushing it to two separate destinations, but it would be the same event information. That would be one way of doing it. If there's a reason you have to go from data warehouse to data warehouse, we'd want to understand a little more of the reasoning.
Q: Do you have the ability to store previous events? Like a few months of events triggered before adding the warehouse destination?
Yes. You do have the ability to do something like a replay (a historical resend of events that were seen by a source). One small caveat: the source has to have seen the event before it can be replayed downstream into another tool like a warehouse. But, yes, you can take previously seen events from a source and send them into your warehouse. We suggest working with your Segment account team to better understand what information needs to be replayed.
Q: Is there a limit of time for the replay?
There isn't a limit; a replay can go as far back as when the source was created. When you do the replay, the configurable parts are the source that saw the events, the destination they need to go to, and the time frame of the events that need to be sent.
There are some advanced filters that can be added if, for example, only a specific event needs to be sent from that source. But this is pretty use-case specific.
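The replay parameters described above (source, destination, time frame, and an optional event filter) can be sketched as a small Python data structure. This is only a model of the concept: `ReplayRequest`, `select_for_replay`, and the sample archive are hypothetical names for illustration, not a Segment API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ReplayRequest:
    source: str                        # the source that originally saw the events
    destination: str                   # where the events should be resent
    start: datetime                    # beginning of the time frame (inclusive)
    end: datetime                      # end of the time frame (exclusive)
    event_names: Optional[set] = None  # optional filter; None means all events


def select_for_replay(archive, request):
    """Pick archived events that match the source, time frame, and optional filter."""
    selected = []
    for event in archive:
        if event["source"] != request.source:
            continue
        if not (request.start <= event["timestamp"] < request.end):
            continue
        if request.event_names is not None and event["event"] not in request.event_names:
            continue
        selected.append(event)
    return selected


# Hypothetical archive of events previously seen by two sources.
archive = [
    {"source": "website", "event": "Page Viewed", "timestamp": datetime(2023, 1, 15)},
    {"source": "website", "event": "Order Completed", "timestamp": datetime(2023, 2, 10)},
    {"source": "mobile", "event": "Order Completed", "timestamp": datetime(2023, 2, 11)},
]

request = ReplayRequest(
    source="website",
    destination="warehouse",
    start=datetime(2023, 1, 1),
    end=datetime(2023, 3, 1),
    event_names={"Order Completed"},
)
replayed = select_for_replay(archive, request)
```

Note that only events the source has already seen can be selected, which mirrors the caveat mentioned earlier.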
Q: When using a Postgres destination, will Segment create one schema for each selected source?
Yes. By default, it will create a schema for each source. From there, the hierarchy cascades down from the warehouse schemas: the events that have been created become tables within each schema, and their properties become columns.
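The hierarchy (source to schema, event to table, property to column) can be sketched in Python as follows. The naming normalization here is a simplified assumption for illustration and may not match Segment's exact rules; `snake_case` and `warehouse_location` are hypothetical helpers.

```python
import re


def snake_case(name: str) -> str:
    """Simplified normalization: split camelCase, lowercase, non-alphanumerics become underscores."""
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def warehouse_location(source_name: str, event: dict):
    """Map a track-style event to a (schema, table, columns) triple."""
    schema = snake_case(source_name)                    # one schema per source
    table = snake_case(event["event"])                  # one table per event name
    columns = sorted(snake_case(key) for key in event.get("properties", {}))
    return schema, table, columns


schema, table, columns = warehouse_location(
    "Marketing Site",
    {"event": "Order Completed", "properties": {"orderId": "o-1", "total": 30}},
)
print(schema, table, columns)
```

Under these assumptions, an event from a second source would land in a separate schema even if it shares the same event name.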
Q: If we already have data sources connected to a data warehouse, what benefit do we have of getting the same data from Segment?
This is a reverse ETL question, but one potential use would be if you're using Segment Connections alongside Engage: you may want to pull some of that data out and send it into an Engage/Unify destination. For those familiar with Unify and Engage, you select which sources you want to have identity resolution applied to, and if there's data that perhaps isn't represented in that space, you could reference the warehouse to do that.
The other use case for leveraging reverse ETL with existing Segment data that lives in your warehouse could be to send it to another reverse ETL destination that doesn't otherwise have access to that data.
Q: For enriching Segment with historical data, is reverse ETL the go-to way to do it?
I would say so, for sure. A lot of our current and new customers are leveraging reverse ETL to do that enrichment.
So, yeah, it's definitely a good path to choose.
Q: Can you speak to using reverse ETL to turn data in our warehouse into events? Eventizing data…
You can think of a warehouse as almost like an Excel spreadsheet. If you have individual records of a user or an event logged in each of those rows, those rows can be turned into an eventized stream of data using reverse ETL.
That stream of data will be represented as a Segment Connections source, which is one of the destination options that you have. Those records will come through as, for example, an identify call for a user or a track call for an appointment created. Those events can then be distributed downstream to other tools, or used for user enrichment, as we alluded to earlier.
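As a rough illustration of "eventizing" rows, the Python sketch below turns warehouse-style records into identify-style and track-style payloads. It assumes each row carries a `user_id` column and treats the remaining columns as traits or properties; the helper names and sample rows are hypothetical, and in practice the mapping is configured in reverse ETL rather than hand-written.

```python
from typing import Any


def row_to_identify(row: dict[str, Any], id_column: str = "user_id") -> dict[str, Any]:
    """One user row becomes an identify-style payload; other columns become traits."""
    traits = {key: value for key, value in row.items() if key != id_column}
    return {"type": "identify", "userId": str(row[id_column]), "traits": traits}


def row_to_track(row: dict[str, Any], event_name: str, id_column: str = "user_id") -> dict[str, Any]:
    """One event row becomes a track-style payload; other columns become properties."""
    properties = {key: value for key, value in row.items() if key != id_column}
    return {
        "type": "track",
        "userId": str(row[id_column]),
        "event": event_name,
        "properties": properties,
    }


# Hypothetical rows, as they might sit in warehouse tables.
user_rows = [{"user_id": 42, "email": "ada@example.com", "plan": "pro"}]
appointment_rows = [
    {"user_id": 42, "appointment_id": "a-1", "clinic": "Downtown"},
    {"user_id": 43, "appointment_id": "a-2", "clinic": "Uptown"},
]

identifies = [row_to_identify(row) for row in user_rows]
tracks = [row_to_track(row, "Appointment Created") for row in appointment_rows]
```

Each row yields exactly one call, so a table of appointments becomes a stream of "Appointment Created" events that downstream tools can consume.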
Additional Resources
Choosing the right data warehouse
Webinar recording: https://segmentio.wistia.com/medias/6zdkqbzp68