Geodemographic Segmentation ... big words, big returns
In today’s rapidly changing business environment, marketers who employ segmentation analysis have a clear competitive edge. By knowing who their customers are and how best to reach them, companies can both increase the satisfaction of their current customers and expand their customer base.
Some of the major advantages of choosing a geodemographic segmentation system such as US MOSAIC are:
- Versatility: From customer acquisition and retention to site selection to analyzing market potential, geodemographic segmentation can be used by virtually any customer-based industry to address a host of marketing needs. Once a customer base is profiled and targeted, media planning can be applied across the entire marketing mix including billboards, print advertising, cable, radio, and the Internet.
- Accountability: By periodically measuring the prevalence of certain key clusters in its customer base, a business can track the effectiveness of its marketing strategies over time.
- User Friendly: Colorful, easy-to-read maps are far easier to analyze than spreadsheets and give busy executives a clear snapshot of their key markets.
- Cost Effective: Finely tuned targeting allows marketers to “home in” on their audience, resulting in less expensive and higher-yielding marketing strategies.
Neighborhood classification, or segmentation, is one of the cornerstones of geodemographic analysis and is used in a wide range of marketing and site location applications, including neighborhood description, customer analysis, facility planning, advertising, and direct mail. The attractiveness of neighborhood segmentation stems from the technique's analytical performance and its inherent simplicity and understandability. U.S. Mosaic is the latest in a series of neighborhood classification systems built by Experian, whose international lifestyle segmentation experience spans over twenty years and nearly twenty countries. Mosaic combines the best of Experian’s international expertise with the knowledge of some of the most experienced segmentation experts in North America. During the product development and refinement process, Mosaic was compared to other leading U.S. segmentation systems in a wide variety of tests. The results indicated that Mosaic's overall performance is better than or equal to that of other established segmentation systems.
Segmentation Objectives
Neighborhood segmentation systems are classifications of geographic areas according to their demographic, lifestyle, and other attributes. The goal of classification is to define a set of segments that are as different from one another as possible while ensuring that the neighborhoods assigned to each segment are as similar to one another as possible. In many discussions of segmentation, the words “cluster” and “segment” are used interchangeably.
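One common way to put numbers on this goal is to compare the within-segment and between-segment sums of squares: a good classification makes the first small and the second large. The sketch below is illustrative only; the function name and the toy data are assumptions, not part of Mosaic.

```python
def within_between_ss(points, labels):
    """Return (within-segment SS, between-segment SS) for a labeling.

    points: list of equal-length numeric lists, one per neighborhood.
    labels: segment label for each neighborhood.
    """
    dims = len(points[0])
    grand = [sum(p[d] for p in points) / len(points) for d in range(dims)]

    # Group neighborhoods by their assigned segment.
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)

    within = 0.0
    between = 0.0
    for members in groups.values():
        center = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        # Spread of neighborhoods around their own segment average.
        within += sum((p[d] - center[d]) ** 2 for p in members for d in range(dims))
        # Distance of the segment average from the overall average.
        between += len(members) * sum(
            (center[d] - grand[d]) ** 2 for d in range(dims)
        )
    return within, between


# Toy data: two obvious groups along the first variable.
toy = [[0, 0], [0, 2], [10, 0], [10, 2]]
good = within_between_ss(toy, [0, 0, 1, 1])  # groups match the data
poor = within_between_ss(toy, [0, 1, 0, 1])  # groups cut across the data
```

The two quantities always add up to the same total for a given data set, so lowering the within-segment figure necessarily raises the between-segment figure.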
The objectives during the construction of Mosaic were:
- To create a series of segments that provides the most powerful discrimination of consumer behavior, lifestyles, and attitudes.
- To build segments that are as recognizable and meaningful as possible to marketers.
- To ensure that each of the segments contains sufficient numbers of households to be statistically reliable for most analyses.
- To ensure that each segment is homogeneous in terms of demographics and consumer behavior.
- To avoid an excessive concentration of individual segments within particular geographic regions, except where appropriate.
Methodology
The original formulation of Mosaic was built using 1990 Census demographics at the block group level of geography. Block groups are typically neighborhoods of approximately 500 households. Over two hundred and twenty-five thousand such block groups were defined in the 1990 Census, the vast majority of which were included in the analysis to develop the Mosaic classification. Block groups that did not meet a minimum household count threshold were excluded from the analysis.
The variables chosen for consideration in the model were selected based on several factors and objectives:
- Variables should represent different demographic categories, avoiding overrepresentation of any single category. For example, a system based excessively on income characteristics would fail to discriminate neighborhoods on other characteristics such as household type, size, and age.
- Variables should not be highly correlated, or closely related, with one another; only the most predictive variables from such sets were selected.
- Variables should correlate well with consumer behavior.
- Variables should have sufficient sample size to be statistically valid.
- Variables should not be heavily concentrated in a small number of geographic areas.
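The correlation criterion above can be sketched as a greedy filter: rank the candidate variables by some predictive score, then keep a variable only if it is not strongly correlated with one already kept. This is a minimal illustration, not the selection procedure actually used for Mosaic; the variable names, scores, and the 0.8 cutoff are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def prune_correlated(columns, scores, max_abs_r=0.8):
    """Keep the highest-scoring variable from each highly correlated set.

    columns: dict of variable name -> list of values per neighborhood.
    scores: dict of variable name -> predictive score (higher is better).
    """
    kept = []
    for name in sorted(columns, key=lambda n: scores[n], reverse=True):
        if all(abs(pearson(columns[name], columns[k])) < max_abs_r for k in kept):
            kept.append(name)
    return kept


# Hypothetical candidate variables for five neighborhoods.
candidates = {
    "median_income": [30, 45, 60, 80, 100],
    "mean_income": [33, 47, 65, 85, 110],
    "pct_renters": [40, 70, 20, 55, 35],
}
predictive_scores = {"median_income": 0.9, "mean_income": 0.7, "pct_renters": 0.6}
selected = prune_correlated(candidates, predictive_scores)
```

Here the two income measures move almost in lockstep, so only the higher-scoring one survives, while the renter percentage is retained because it carries different information.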
The variables used in the development of Mosaic reflect a balance of the range of factors that affect consumer behavior, expenditures, and attitudes.
Unlike other systems that rely on data reduction techniques such as factor analysis, the methodology employed allowed each individual variable in its raw form to influence the cluster code given to a particular block group. A unique variable weighting facility was used that allowed different levels of influence to be assigned to different variables. This facility is used to weight more heavily the influence of highly predictive variables and of variables from categories whose availability is poor. Likewise, overrepresented categories can be weighted less heavily.
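One simple way variable weights can enter a clustering of this kind is through the distance measure, by multiplying each variable's squared difference by its weight. The source does not describe how Mosaic's weighting facility works internally, so the sketch below is an assumption made for illustration, and all names and values in it are hypothetical.

```python
def weighted_sq_dist(a, b, weights):
    """Squared distance in which each variable's contribution is scaled by its weight."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))


def nearest(point, centers, weights):
    """Index of the cluster center closest to the point under the given weights."""
    return min(
        range(len(centers)),
        key=lambda c: weighted_sq_dist(point, centers[c], weights),
    )


# A neighborhood and two cluster centers, already standardized.
neighborhood = [0.0, 0.0]
cluster_centers = [[1.0, 0.0], [0.0, 1.2]]

equal_choice = nearest(neighborhood, cluster_centers, [1.0, 1.0])
# Tripling the first variable's weight makes differences on it count more.
weighted_choice = nearest(neighborhood, cluster_centers, [3.0, 1.0])
```

With equal weights the neighborhood falls to the first center; once the first variable is weighted more heavily, the gap on that variable dominates and the second center becomes the better fit.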
The cluster algorithm used to build Mosaic is known as “iterative relocation” and relies on statistical similarity measures based on least squared differences. Prior to clustering, the variables are standardized using their means and standard deviations in order to eliminate the effects of measurement scale on the results (e.g., dollars versus percentages). Starting from random points chosen in proportion to population, the algorithm assigns each neighborhood to the most similar cluster. The average score of each cluster is then recalculated on each input variable, after which neighborhoods are reassigned to new clusters wherever a better fit is achieved. The process is repeated over many cycles in order to optimize the resulting classification. When complete, the final set of neighborhood segments is as different as possible across the input variables, and within each segment the neighborhoods are as similar as possible. The resulting 62 segments were then classified into groups of segments in order to provide a simple hierarchical structure.
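The steps described above (standardize, pick population-weighted starting points, assign, recalculate averages, reassign until stable) can be sketched in a few lines. This is a generic k-means-style illustration of iterative relocation, not Experian's implementation; the function names, the toy data, and the stopping rule are assumptions.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Convert each column to z-scores so measurement scale does not matter."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Sum of squared differences between two neighborhoods' scores."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def iterative_relocation(rows, households, k, max_cycles=100, seed=0):
    """Assign each neighborhood to one of k clusters; returns a list of labels."""
    if k > len(rows):
        raise ValueError("k cannot exceed the number of neighborhoods")
    rng = random.Random(seed)
    z = standardize(rows)

    # Draw k distinct starting points with probability proportional to households.
    starts = []
    while len(starts) < k:
        i = rng.choices(range(len(z)), weights=households)[0]
        if i not in starts:
            starts.append(i)
    centers = [list(z[i]) for i in starts]

    labels = [None] * len(z)
    for _ in range(max_cycles):
        # Assign every neighborhood to its most similar cluster.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in z
        ]
        if new_labels == labels:
            break  # no neighborhood relocated, so the classification is stable
        labels = new_labels
        # Recalculate each cluster's average score on every input variable.
        for c in range(k):
            members = [p for p, lab in zip(z, labels) if lab == c]
            if members:
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


# Toy neighborhoods: [median income in dollars, percent college graduates].
data = [[20000, 10], [21000, 12], [19000, 11],
        [90000, 60], [95000, 62], [88000, 58]]
household_counts = [500, 450, 520, 480, 510, 495]
segments = iterative_relocation(data, household_counts, k=2)
```

Without the standardization step, the income column (measured in tens of thousands of dollars) would swamp the percentage column in every distance calculation, which is the scale effect the text describes.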



