Data Management Platforms: An Empirical Taxonomy

Data management platforms (DMPs) are a widely used means of placing targeted advertising, for example, commercial or political advertisements. However, only a few academic papers shed light on the platforms’ mechanisms. The opacity of these mechanisms makes it hard for consumers to understand what happens with their data, and regulators struggle to implement effective regulations. Hence, we develop a taxonomy to understand and compare different characteristics of DMPs. Following the method of Nickerson et al. (2013) and combining an inductive and a deductive approach, we identify eight dimensions that differentiate DMPs. We evaluate the taxonomy’s applicability and test it with a set of nine DMPs, which we select by feasibility, relevance, and popularity. The application shows that the eight dimensions cover the significant features that explain most of the variance in characteristics between DMPs. The evaluation also reveals opportunities for further development of the taxonomy.


34TH BLED ECONFERENCE
Digital Support From Crisis to Progressive Change

In this study, we focus on the functions performed by DMPs and put forward the following research question: RQ: How can we distinguish the functionalities of data management platforms? What are the discriminating characteristics?
We address this question by developing a taxonomy that provides an overview and comparison of DMPs. The taxonomy is developed following the Nickerson et al. (2013) approach using an iterative process that combines inductive and deductive phases. It is evaluated by applying it to nine DMPs. Our taxonomy contributes to a better understanding of DMPs, their functionalities, and their mechanisms. Investigating the characteristics of DMPs (e.g., availability of anonymisation functions, real-time processing) contributes knowledge to the debate on targeted advertising and privacy, with implications for future regulations.

Data Management Platforms in Real-Time Bidding Systems
DMPs are most prominent in the real-time bidding process (see Figure 1), in which there are two main stakeholders: the advertisers and website operators. To realise real-time targeted advertising, ad-networks take offers of advertising slots from supply-side platforms and match these with bids for advertising slots from demand-side platforms. For the matching process, demand-side platforms use real-time processed data from DMPs (e.g., consumer segments), predefined selection criteria, empirical values, and publishers' predictions about whether the advertising space will be worthwhile (Dawson, 2014; Zhang et al., 2014). If all conditions are met, the deal is closed on the ad-network, and the supply-side platform places the advertiser's advertisement in the advertising space (Wang et al., 2017). A DMP (usually a separate company from the advertising network) collects, stores, manages, and processes the relevant data. Effective coordination between these multiple actors (ad-network, demand- and supply-side platforms, and a DMP) facilitates targeted advertising for defined consumer segments in near real-time. The information from the data or the data itself might be traded on data marketplaces (Lange et al., 2018). While critics contend that arrangements like this can violate consumers' privacy, proponents contend that they serve consumers by presenting them with offers likely to appeal to them.
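The matching step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any actual ad-network: the names SlotOffer, Bid, and match_slot, the floor price, and the highest-bid-wins rule are hypothetical and do not come from the cited sources. The sketch only shows where DMP-provided consumer segments enter the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotOffer:
    """An advertising slot offered via a supply-side platform (hypothetical model)."""
    slot_id: str
    floor_price: float  # assumed minimum acceptable price; not from the source


@dataclass(frozen=True)
class Bid:
    """A bid placed via a demand-side platform on behalf of one advertiser."""
    advertiser: str
    target_segment: str  # consumer segment the advertiser wants to reach
    price: float


def match_slot(offer, visitor_segments, bids):
    """Return the winning bid for a slot, or None if no bid qualifies.

    A bid qualifies if the visitor belongs to the targeted consumer segment
    (the data a DMP would supply) and the price reaches the floor price.
    The highest qualifying price wins (an assumed auction rule).
    """
    qualifying = [
        b for b in bids
        if b.target_segment in visitor_segments and b.price >= offer.floor_price
    ]
    return max(qualifying, key=lambda b: b.price, default=None)


offer = SlotOffer(slot_id="homepage-banner", floor_price=0.50)
visitor_segments = {"sports", "travel"}  # segments a DMP might report for this visitor
bids = [
    Bid("shoe-brand", "sports", 0.80),
    Bid("airline", "travel", 1.20),
    Bid("car-maker", "automotive", 2.00),  # visitor is not in this segment
    Bid("hostel", "travel", 0.30),         # below the floor price
]
winner = match_slot(offer, visitor_segments, bids)
print(winner.advertiser if winner else "no deal")  # airline
```

In this toy setting the highest bid overall (car-maker) does not win because the visitor is not in its target segment, which illustrates why the segment data held by a DMP matters for the outcome.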

Research Method
To develop a taxonomy of DMPs that allows for their differentiation, we followed the method proposed by Nickerson et al. (2013). The method follows an iterative process through which the taxonomy is gradually built by adding new characteristics.
Initially, we curated a longlist of 48 DMPs using internet sources such as articles or reports from auditing companies (e.g., Andrew et al., 2020; Moffett & Chien, 2019).
In our iterations to develop the taxonomy, we derived the characteristics deductively from the literature (Kamps & Schetter, 2018; Wang et al., 2017) and inductively from existing DMPs (Adobe, 2020; Google, 2020; Lotame, 2020; Oracle, 2020b). The iterative process terminated after six iterations because all specified subjective and objective termination conditions were met. With regard to the objective termination conditions, this means that (1) no dimension or characteristic was merged or split in the last iteration, (2) each dimension was unique and did not repeat, and (3) each characteristic was unique within its dimension. The five subjective conditions used to evaluate the taxonomy's quality after each iteration were conciseness, robustness, extensibility, comprehensiveness, and explainability (Nickerson et al., 2013) (see appendix). In the end, the resulting taxonomy was discussed and jointly evaluated by the three authors. The taxonomy's evaluation was performed using nine DMPs from the longlist (see appendix, Table 2). We selected these nine DMPs because they met our requirements in terms of (1) feasibility (amount of information and trial version availability), (2) relevance (assessing the DMP revenue), and (3) popularity (number of clients and awards, which illustrates the influence and reach of the DMPs that can affect people). For each of the nine DMPs, we gathered the necessary information to apply the taxonomy, relying primarily on official information sources, including the DMPs' public websites and trial versions if available. Where information on a website was incomplete or missing, we contacted the DMP's support via phone, email, or the contact form on the website.
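The three objective termination conditions listed above lend themselves to a mechanical check. The following Python sketch is an illustration only and not part of the Nickerson et al. (2013) method: it assumes that a taxonomy snapshot is stored as a dictionary mapping each dimension to a list of characteristics, and it approximates condition (1) by testing whether the last iteration changed anything at all. The function name and the example snapshots are hypothetical.

```python
def objective_conditions_met(previous, current):
    """Check the three objective termination conditions on two taxonomy snapshots.

    Each snapshot maps a dimension name to the list of its characteristics.
    """
    # (1) Proxy for "nothing merged or split": the last iteration changed nothing.
    unchanged = (
        previous.keys() == current.keys()
        and all(sorted(previous[d]) == sorted(current[d]) for d in current)
    )
    # (2) Each dimension is unique (compared case-insensitively).
    names = [d.strip().lower() for d in current]
    dimensions_unique = len(names) == len(set(names))
    # (3) Each characteristic is unique within its dimension.
    characteristics_unique = all(
        len(chars) == len(set(chars)) for chars in current.values()
    )
    return unchanged and dimensions_unique and characteristics_unique


# Hypothetical snapshots of two consecutive iterations (two dimensions only).
iteration_5 = {
    "data import": ["first-party data", "second-party data", "third-party data"],
    "web-tracking": ["cookies", "other methods"],
}
iteration_6 = {dim: list(chars) for dim, chars in iteration_5.items()}
print(objective_conditions_met(iteration_5, iteration_6))  # True

# Splitting "other methods" would require a further iteration.
with_split = dict(iteration_6, **{"web-tracking": ["cookies", "fingerprinting", "other methods"]})
print(objective_conditions_met(iteration_6, with_split))  # False
```

The subjective conditions (conciseness, robustness, extensibility, comprehensiveness, and explainability) require human judgement and are deliberately left out of the sketch.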

Taxonomy
The eight main dimensions are data import, generable data, data sources, web-tracking, data processing functions, external data sources, data export, and data security (see Table 1). Data import distinguishes three data types that can be imported into a DMP. First-party data (FPD) is obtained by companies through previous consumer contact, website visits (Kreutzer, 2018), or directly from the consumer. The latter includes browsing behaviours or socio-demographic data such as gender or age (Cederholm & Simpson, 2018; Kamps & Schetter, 2018).
Second-party data (SPD) is first-party data of an external company acquired through direct partnerships (Kamps & Schetter, 2018). Third-party data (TPD) is anonymised data provided by data resellers or data marketplaces.
Generable data distinguishes four types of data that DMPs can generate. One is first-party data that DMPs collect with the help of tracking mechanisms, for example, from internal customer management systems. Apart from first-party data, DMPs can identify consumer segments based on offline and online data analyses. The granularity, that is, the level of detail in the consumer segment, may vary.
Look-a-like user profiles are generated by DMPs using existing first-party data as a basis. Look-a-like profiles are user profiles identified as similar to a company's consumer profiles in terms of indicators such as age, interests, or hobbies. Machine learning is often integrated into the look-a-like modelling process and increases the number of consumers that can be reached with the advertising campaign. DMPs can also determine unique user profiles and track them across websites by using internal and external data sources.
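To make the idea of look-a-like profiles concrete, the sketch below finds candidate profiles whose interests resemble a company's existing (seed) profiles. It is a deliberately simple stand-in, assuming profiles are plain sets of interests compared by Jaccard similarity with a hypothetical threshold of 0.5. It is not the method of any DMP discussed here, which, as noted above, often rely on machine learning; all names and data in the example are invented for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_lookalikes(seed_profiles, candidates, threshold=0.5):
    """Return the sorted ids of candidates whose interests resemble any seed profile."""
    return sorted(
        cid for cid, interests in candidates.items()
        if any(jaccard(interests, seed) >= threshold for seed in seed_profiles)
    )


# Hypothetical first-party seed profile and external candidate profiles.
seed_profiles = [{"running", "hiking", "travel"}]
candidates = {
    "u1": {"running", "hiking", "cooking"},  # 2 shared of 4 distinct interests = 0.5
    "u2": {"gaming", "movies"},              # nothing shared = 0.0
    "u3": {"running", "hiking", "travel"},   # identical = 1.0
}
print(find_lookalikes(seed_profiles, candidates))  # ['u1', 'u3']
```

Lowering the threshold enlarges the reachable audience at the cost of weaker resemblance, which mirrors the trade-off between reach and segment quality.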

The use of cookies is an often-mentioned web-tracking method of DMPs. While some DMPs track data without using cookies (e.g., via fingerprinting), they do not explicitly mention which alternative methods they use. We refer to them as other methods in the taxonomy. The data sources from which DMPs collect data are a company's website, its apps, or social networks.
Real-time analysis facilitates the immediate analysis and processing of data, enabling quick decision-making. Data anonymisation functions, segmentation of data into consumer segments, and demand-side platform test functions are common. DMPs' demand-side platform test functions are useful when operating with multiple external demand-side platforms to identify the appropriate one for the existing DMP. A wrong choice can harm the advertising campaign, with a potential 20% to 40% loss in user profile reach (Joe, 2014). Machine-learning algorithms are used to extract meaningful information, such as product popularity, from unstructured data. Waste management removes records in the DMP that are not relevant for further processing.
Integrating external data sources is crucial for some processes, such as the generation of look-a-like data or the refinement of existing user profiles. In this regard, DMPs either provide a platform themselves or offer interfaces to third-party platforms through which data can be acquired. In addition, some DMPs integrate partner exchange platforms, which enable companies to establish connections to partners with whom they want to exchange anonymised data. The advantage of this is that the origin of the data is known. Knowing the data source adds transparency and trust and also lets people evaluate the quality of the data that originates from the respective source.
Usually, DMPs offer a data export option to integrate with interfaces from customer relationship management software, demand-side platforms, supply-side platforms, and ad-networks/ad-exchanges. Some DMPs provide an export option to data marketplaces on which collected data can be sold to external parties. Apart from specific data export options, the export into a generic local file (e.g., a consumer data feed) enables companies to use the exported data for various purposes, such as in business intelligence tools (Chou et al., 2005). Generally, not only raw data but also processed data can be transferred to other platforms (Kamps & Schetter, 2018). The dimension of data security focuses on General Data Protection Regulation (GDPR) compliance, data security, and consent management. The GDPR, set in place in 2016 by the European Parliament, is a set of regulations concerning the processing of personal data in Europe (Art. 1-99 GDPR). Because the GDPR (Art. 6(1) and Art. 7) envisages that companies need users' voluntary consent before collecting their data, consent management is often integrated into DMPs.
We evaluated our taxonomy by analysing and categorising nine DMPs, showing that the eight dimensions cover the significant features that explain most of the variance in the taxonomy's characteristics across the DMPs. The evaluation results are presented in Table 1. Through the evaluation, we identified opportunities for extending the criteria catalogue, such as with a cost and usability dimension.
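The taxonomy and its application to individual DMPs can also be written down as a small data structure. The sketch below is one possible encoding, not the authors' tooling. The eight dimension names follow the text, whereas the characteristic labels are paraphrased from the prose description and may differ from the exact wording in Table 1. The two classified platforms, dmp_x and dmp_y, are hypothetical and do not reproduce any evaluation result.

```python
TAXONOMY = {
    "data import": ["first-party data", "second-party data", "third-party data"],
    "generable data": ["first-party data", "consumer segments",
                       "look-a-like data", "user identification"],
    "data sources": ["website", "apps", "social networks"],
    "web-tracking": ["cookies", "other methods"],
    "data processing functions": ["real-time analysis", "anonymisation",
                                  "segmentation", "demand-side platform test",
                                  "machine learning", "waste management"],
    "external data sources": ["third-party platforms", "partner exchange platforms"],
    "data export": ["customer relationship management", "demand-side platforms",
                    "supply-side platforms", "ad-networks/ad-exchanges",
                    "data marketplaces", "local file"],
    "data security": ["GDPR compliance", "data security", "consent management"],
}


def classify(observed):
    """Map every dimension to {characteristic: bool} for one DMP.

    `observed` maps dimension names to the set of characteristics found for
    the DMP. Unknown dimensions or characteristics raise a ValueError.
    """
    for dim, chars in observed.items():
        extra = set(chars) - set(TAXONOMY.get(dim, []))
        if dim not in TAXONOMY or extra:
            raise ValueError(f"not part of the taxonomy: {dim!r} {sorted(extra)}")
    return {
        dim: {c: c in observed.get(dim, set()) for c in chars}
        for dim, chars in TAXONOMY.items()
    }


def shared_by_all(classifications):
    """List (dimension, characteristic) pairs present in every classified DMP."""
    return [
        (dim, c)
        for dim, chars in TAXONOMY.items()
        for c in chars
        if all(cl[dim][c] for cl in classifications)
    ]


# Two hypothetical platforms, classified on two of the eight dimensions.
dmp_x = classify({
    "data import": {"first-party data", "third-party data"},
    "data processing functions": {"segmentation", "real-time analysis"},
})
dmp_y = classify({
    "data import": {"second-party data"},
    "data processing functions": {"segmentation", "machine learning"},
})
print(shared_by_all([dmp_x, dmp_y]))  # [('data processing functions', 'segmentation')]
```

Because the dimensions are plain dictionary keys, an additional dimension such as cost or usability could be added without changing the two functions.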

Discussion
The application of the taxonomy showed that DMPs usually involve first-party and third-party data segmentation in real-time, sometimes with the use of machine learning, following the goal of consumer segmentation. Consumer segmentation makes it possible to implement marketing objectives such as serving targeted advertisements, displaying targeted content, political microtargeting, or personalised price discrimination (Badmaeva & Hüllmann, 2019; Klein & Hüllmann, 2018). It can be used as a foundation for placing advertisements or generating look-a-like data (Kamps & Schetter, 2018). The importance of clustering consumer data into segments in the real-time bidding process is reflected in the evaluation of our taxonomy. Data segmentation is the only data processing functionality that all nine evaluated DMPs provide.
Consumer segmentation and the subsequent placement of advertisements have two data-related success factors: data quality and data quantity. First, if the data quality is bad, individuals might have been added to the wrong cluster because the data is inaccurate or inconsistent. Second, the more data is available, the more fine-grained consumer profiles can become, ultimately increasing quality. In light of various web-tracking methods and data export options to other actors in the real-time bidding system, critics argue that DMPs operate at odds with individuals' data privacy. However, our study shows that while DMPs are not necessarily required to comply with the GDPR in their countries of origin, most of them do comply with it because of their European customers, who are subject to the GDPR. The DMP's functionalities supplement the functionalities of other actors in the real-time bidding system, such as supply- and demand-side platforms, ad-networks, and data marketplaces. The option of publishers and/or advertisers to export data to demand- and supply-side platforms makes the real-time bidding process more efficient. Real-time analysis is an essential characteristic because it enables advertisers (or advertisers' supply-side platform) to determine in real-time, before the advertisement is placed, whether a publisher's page is related to the advertisement's content. In that way, advertisements can be placed in advertising spaces where consumers of the target group interact. Another advantage is that placing advertisements on unsuitable or reputation-damaging websites can be avoided (Kreutzer, 2018; Zawadzk & Groth, 2014). Without effective data management, the automated and real-time matching of advertisements and advertising spaces on websites and the realisation of targeted advertisement would hardly be possible. These targeted advertisements help to optimise the reach of advertising campaigns (Yuan et al., 2012), which is why DMPs are valued in practice. The performance of targeted advertising in real-time bidding systems depends on the quality and quantity of the data used. A scenario can occur in which not enough data is available for valuable insights. For such cases, DMPs integrate with data marketplaces to give publishers and/or advertisers the opportunity to buy and integrate external data, enriching subsequent analyses by creating detailed digital consumer profiles or refining existing user data.
Through their characteristic functionalities, DMPs enable effective interaction among different platforms. First-party and third-party data import into a DMP and various export options are exemplary functionalities that facilitate cooperation among actors in the real-time bidding system. For those actors, even though some are direct competitors, the value of cooperating outweighs that of not cooperating. Cooperating allows actors to fill gaps in functionality or to offer an expanded portfolio of multiple demand-side platforms to optimise the output of the advertising campaign. Salesforce, for example, is originally known for its customer relationship software but expands its service by collaborating with Google, allowing its customers to use their customer relationship management data with Google Analytics to perform data analyses (Google & Salesforce, 2020). Further, Salesforce customers can use Google's tracking methods to obtain more data and integrate it into the Salesforce DMP. Cooperating is additionally beneficial for DMPs if they want to offer a more extensive portfolio of platforms (e.g., cooperating with multiple demand-side platforms that offer a DMP themselves). A DMP that integrates different and multiple demand-side platforms has the advantage of reaching a broader range of users, as different DMPs can cooperate with different ad-networks or ad-exchanges, and provide customers with multiple options between different demand-side platforms (Oracle, 2020a; The Trade Desk, 2020).
In the end, DMPs can be seen as a double-edged sword. On the one side, they provide indirect value to consumers as they support the placement of targeted advertisements and thereby help ensure that consumers see only the advertisements that are relevant to them. On the other side, DMPs can contribute to the efficient distribution of harmful, distorted, or fake content to consumer segments because the advertisements placed during the real-time bidding process are not evaluated. In that way, misleading content or fake news can be spread. Creating look-a-like data makes this approach scalable by extending existing consumer segments with consumers that have similar profiles.

Conclusions, Limitations, and Outlook
With this study, we contribute a taxonomy that helps to understand and distinguish DMP functionalities. The taxonomy establishes a common ground for discussions on implications, for example, with regard to data privacy, data security, and the manipulation of opinions and consumer behaviour. It further helps to grasp the DMP's role in real-time bidding systems. Through an inductive and deductive procedure, eight taxonomy dimensions were identified that cover the main functionalities and mechanisms of DMPs. To ensure the reliability and validity of our taxonomy, we evaluated it with nine selected DMPs. Our developed taxonomy and its application clarify which characteristics define a DMP and guide discussions about specific functions and mechanisms of DMPs. The comparison of different DMPs in the course of the evaluation is additionally helpful for practitioners when choosing a particular DMP for adoption.
As a limitation, we note that the developed taxonomy's categories and characteristics change as data management platforms change. For example, new legal regulations may be set in place over time, or new data sources may emerge. In that sense, the taxonomy is only valid until data management platforms change and new characteristics and dimensions emerge. Therefore, future research should investigate the changes that DMPs go through to update the proposed taxonomy. Besides this limitation, there are opportunities for future research to extend and refine the taxonomy. In particular, the web-tracking dimension would benefit from refinement. In our study, we lacked information on the application of other tracking methods as well as on which cookie types are used by DMPs. Altogether, our study provides first insights into the functionalities and mechanisms of DMPs and thereby into their role in the real-time bidding system. Yet more research is needed in this field, and the role of the remaining actors needs to be studied to fully understand real-time bidding systems and their implications.

Table 1: Taxonomy on DMPs and evaluation. A = Adobe, G = Google, L = Lotame, M = MediaMath, Ne = Neustar, Ni = Nielsen, O = Oracle, S = Salesforce, T = The Trade Desk. Sources referenced by footnotes were used to establish the taxonomy dimensions and characteristics, while an "x" in the evaluation marks a DMP that possesses the specific characteristic.
[Only part of the table body survives; the assignment of the "x" marks to individual DMP columns is not recoverable.]
Data (sources A, B, C, D, E, G): x x x x x x x x
Consumer Segments (sources A, B, C, D, E, G): x x x x x x x x x
Look-a-like Data (sources A, B, D, E): x x x x x x x x x
User Identification (sources A, B, D, E, G): x x x x x x x x
Data Sources (sources C, D, F, G)
Apps (sources A, B, D, E): x x x x x x x x
Footnote sources: A = Adobe, 2020; B = Google, 2020; C = Kamps & Schetter, 2018; D = Lotame, 2020; E = Oracle, 2020; F = Schonschek, 2020; G = Wang et al., 2017.