Austin: The Next High-Growth Market

A common practice in the investment world is to build mental models that facilitate pattern recognition. Within the equities space, we see this when investors draw parallels between stocks, such as Stock [X] being the next Apple or Stock [Y] being the next Facebook. Within the venture capital space, we see this when startups are deemed the Warby Parker of [X] or the Uber of [Y]. Similarly, within the real estate space, the goal may be to identify the next Austin, or the next up-and-coming market.

In the absence of other data points, the traditional approach to market selection typically consists of analyzing fundamental factors such as demographics, employment, and supply-demand trends. While useful to some degree, this approach produces a restricted view of the world because it parameterizes an entire city with just a few data points. Importantly, it tends to ignore what’s happening at the micro level, including patterns such as new business formation and trends that can easily be overlooked by the human eye. The question then is: How can we leverage machine learning and alternative data to produce a more holistic view of a market?

Borrowing from Natural Language Processing

In order to leverage machine learning, we need to properly model the features associated with geographic representations. We need the ability to quantify what it means to have three new coffee shops open on the same block, the addition of a Michelin-starred restaurant, or the opening of a new school.

The idea of taking something that’s not easy to model in a standard way, and applying structure to it, is something that’s common within Natural Language Processing (NLP). Take, for instance, text, which is inherently messy to process, as the intent is more than just the words on the paper — there is a structure within the sentence, a structure within a page, and structure across paragraphs.

Within NLP, a simple yet powerful technique for building a numeric representation of a body of text is the bag-of-words model. In this model, we count the number of occurrences of each word in the text and collect those counts into a vector, with one entry per word in the vocabulary.


While this model has its shortcomings in terms of capturing more nuanced relationships of writing, it does capture larger relationships that allow us to make broader comparisons. For example, are two books written by the same author? Or are two books of the same genre?
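As a minimal illustration of the bag-of-words idea described above, the sketch below counts word occurrences in two short sentences and lays the counts out against a shared vocabulary. The sentences, the tokenizing regular expression, and the function name `bag_of_words` are assumptions made for this example only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(text, vocabulary):
    """Return one count per vocabulary word, in vocabulary order."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

# Two made-up documents, used only to illustrate the technique.
doc_a = "The coffee shop opened next to another coffee shop."
doc_b = "The school opened near the library."

# The shared vocabulary fixes the meaning of each vector position.
vocabulary = sorted(set(tokenize(doc_a) + tokenize(doc_b)))

vec_a = bag_of_words(doc_a, vocabulary)
vec_b = bag_of_words(doc_b, vocabulary)

print(vocabulary)
print(vec_a)  # [1, 2, 0, 0, 1, 1, 0, 2, 1, 1]
print(vec_b)  # [0, 0, 1, 1, 0, 1, 1, 0, 2, 0]
```

Because both vectors share the same positions, they can be compared directly, which is the property the rest of the approach relies on.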

Generating neighborhood vectors

We can take a similar approach to modeling a zip code, where the vector representation of the zip code is simply a count of the points of interest that reside within it. If we describe a zip code by its total number of schools, coffee shops, bars, daycares, and libraries, we can then use that vector to compare one geographic region to another.
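By analogy with the bag-of-words model, a zip code vector can be sketched as a count of venues per category. The records below are hypothetical and only loosely modeled on the sample data in this article; the category list and the helper name `zip_vector` are likewise assumptions for illustration, not the production pipeline.

```python
from collections import Counter

# Fixed category order so every zip code vector has the same layout.
CATEGORIES = ["Bar", "Barbecue", "Greek Restaurant", "Hotel",
              "Lounge", "Pizzeria", "Speakeasy"]

# Hypothetical venue records: (postal_code, [categories]).
venues = [
    ("37203", ["Bar"]),
    ("37203", ["Hotel"]),
    ("37203", ["Barbecue"]),
    ("37203", ["Speakeasy", "Lounge"]),
    ("43831", ["Pizzeria", "Bar"]),
    ("43831", ["Hairdresser"]),   # not in CATEGORIES, so it is ignored
    ("43831", ["Greek Restaurant", "Lounge"]),
]

def zip_vector(venues, postal_code, categories=CATEGORIES):
    """Count venues per category for one postal code."""
    counts = Counter()
    for code, venue_categories in venues:
        if code == postal_code:
            counts.update(venue_categories)
    return [counts[category] for category in categories]

vector_37203 = zip_vector(venues, "37203")
vector_43831 = zip_vector(venues, "43831")

print(vector_37203)  # [1, 1, 0, 1, 1, 0, 1]
print(vector_43831)  # [1, 0, 1, 0, 1, 1, 0]
```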


Leveraging our partnership with Foursquare, we start with a curated dataset of real-time business formation, which gives us unique insight into the vitality of a neighborhood. This dataset also provides rich location attributes, such as foot traffic at the location, hours of the venue, and detailed venue categorizations.

The first step in producing a zip code vector is generating a monthly roll-up of the data over 12 years, aggregated by more than 200 categories. Encoding a time dimension into the zip code vector gives us two advantages over standard feature encoding. First, it allows us to measure the rate of change within a neighborhood, which provides insight into the dynamism of the region. Second, and more importantly, it allows us to compare geographic regions at two different points in time.
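The roll-up step could look roughly like the sketch below, which groups records by month and postal code and then measures the change between two months. The record layout (one row per newly observed venue), the sample rows, and the function names are assumptions for this example; the actual dataset schema is not described in this article.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical records: (first_seen_date, postal_code, category).
observations = [
    (date(2019, 1, 3), "37203", "Bar"),
    (date(2019, 1, 17), "37203", "Bar"),
    (date(2019, 1, 20), "37203", "Barbecue"),
    (date(2019, 2, 5), "37203", "Bar"),
    (date(2019, 2, 9), "37203", "Pizzeria"),
    (date(2019, 2, 11), "48331", "Pizzeria"),
]

def monthly_rollup(observations):
    """Count records per category, keyed by (first day of month, postal code)."""
    rollup = defaultdict(Counter)
    for day, postal_code, category in observations:
        month = day.replace(day=1)
        rollup[(month, postal_code)][category] += 1
    return rollup

def change_between(rollup, postal_code, start, end, categories):
    """Per-category difference in counts between two months."""
    before = rollup.get((start, postal_code), Counter())
    after = rollup.get((end, postal_code), Counter())
    return [after[c] - before[c] for c in categories]

rollup = monthly_rollup(observations)
categories = ["Bar", "Barbecue", "Pizzeria"]
delta = change_between(rollup, "37203", date(2019, 1, 1), date(2019, 2, 1), categories)

print(delta)  # [-1, -1, 1]
```

Keeping the month in the key is what makes both uses of the time dimension possible: differencing one zip code against itself, and comparing one zip code today with another zip code in an earlier month.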

The second step is to scale the data across neighborhoods to ensure we weight our features correctly. Within NLP, a commonly used technique is Term Frequency-Inverse Document Frequency (TF-IDF). The rationale behind this technique is that we want to highlight meaningful words rather than assume that common words are meaningful. Similarly, when describing a zip code, the introduction of something like a Michelin-starred restaurant or a Whole Foods should be weighted more heavily than something more common across zip codes, like a convenience store or gas station.
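To make the weighting concrete, here is a small sketch of one common TF-IDF formulation applied to category counts, where each zip code plays the role of a document and each venue category the role of a word. The counts, the zip code labels, and the "Fine Dining" category are invented for illustration, and the unsmoothed `log(N / df)` form of the inverse document frequency is an assumption; libraries and practitioners often use smoothed variants.

```python
import math

# Hypothetical category counts for three zip codes.
zip_counts = {
    "zip_a": {"Gas Station": 4, "Convenience Store": 5, "Fine Dining": 1},
    "zip_b": {"Gas Station": 3, "Convenience Store": 6},
    "zip_c": {"Gas Station": 5, "Convenience Store": 2},
}

def tf_idf(zip_counts):
    """Weight each category count by how rare the category is across zip codes."""
    n_zips = len(zip_counts)

    # Document frequency: number of zip codes in which a category appears.
    doc_freq = {}
    for counts in zip_counts.values():
        for category, count in counts.items():
            if count > 0:
                doc_freq[category] = doc_freq.get(category, 0) + 1

    weights = {}
    for zip_code, counts in zip_counts.items():
        total = sum(counts.values())
        weights[zip_code] = {
            category: (count / total) * math.log(n_zips / doc_freq[category])
            for category, count in counts.items()
            if count > 0
        }
    return weights

weights = tf_idf(zip_counts)
print(weights["zip_a"])
```

In this toy data, gas stations and convenience stores appear in every zip code, so their weight is zero, while the single rare venue in `zip_a` receives a positive weight despite its low count.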

This ability to numerically describe a zip code at a given point in time is what allows us to determine, for instance, which markets today look like Austin did in 2014.

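| Venue Name | Address | Postal Code | Foot Traffic Score | Category | Category 2 |
| --- | --- | --- | --- | --- | --- |
| Corner Pub | xxxx | 37203 | .44444 | Bar | |
| House Inn | xxx | 37203 | .45123 | Hotel | |
| Jack's BBQ | xxx | 37203 | .451223 | Barbecue | |
| A House | xx | 37203 | .8999 | Speakeasy | Lounge |
| Pizza Pub | xxxx | 43831 | .68390 | Pizzeria | Bar |
| Great Cuts | xxx | 43831 | .50023 | Hairdresser | |
| Leo's | xxx | 43831 | .60013 | Diner | |
| Olga's Kitce | xx | 43831 | .89991 | Greek Restaurant | Lounge |

| Date | Postal Code | # Bar | # Pizzeria | # Barbecue | # Greek Restaurant | # Speak Easy |
| --- | --- | --- | --- | --- | --- | --- |
| 2018/01/01 | 48331 | 10 | 4 | 1 | 1 | 1 |
| 2019/01/01 | 48331 | 6 | 6 | 2 | 1 | 0 |
| 2017/01/01 | 37203 | 3 | 2 | 3 | 0 | 1 |
| 2019/11/01 | 37203 | 10 | 3 | 6 | 0 | 3 |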

Comparing neighborhoods

Now that we have a vector representation of zip codes, we can leverage this data set to identify markets that exhibit similar characteristics in terms of business formation. To do this, we need to measure the distance between our target vector and input vector, where the target vector corresponds to the neighborhood with our desired characteristics—for instance, Austin in 2014.
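A minimal sketch of this comparison step follows, using cosine similarity, the metric this article applies to its examples. The target and candidate vectors are made-up numbers, and the names `cosine_similarity` and `rank_by_similarity` are chosen for this illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(target, candidates):
    """Sort candidate vectors from most to least similar to the target."""
    scored = [(name, cosine_similarity(target, vector))
              for name, vector in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical target vector and candidate zip code vectors.
target = [10, 3, 6, 0, 3]
candidates = {
    "zip_a": [6, 6, 2, 1, 0],
    "zip_b": [20, 6, 12, 0, 6],   # same mix as the target at twice the scale
    "zip_c": [0, 0, 0, 5, 0],     # no categories in common with the target
}

ranking = rank_by_similarity(target, candidates)
for name, score in ranking:
    print(name, round(score, 3))
```

Cosine similarity compares the mix of venue types rather than their absolute number, so `zip_b` scores essentially 1.0 against the target even though it has twice as many venues. That property is useful when comparing regions of different sizes.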

To show that this is a viable approach to describing and comparing zip codes, let’s look at two examples: Hoboken, New Jersey, and Birmingham, Michigan. These two cities represent very different geographic regions: Birmingham has suburban characteristics, while Hoboken is more urban.

Using cosine similarity as our measure of closeness, we see that the zip codes our model found to be most similar to Birmingham, Michigan, are more affluent neighborhoods where the median home value, household income, and educational attainment are well above the national average.

| City/MSA | Zip Code | Similarity Rank | Median Home Value | Median Household Income | % of Population with a Bachelor's Degree or Higher |
| --- | --- | --- | --- | --- | --- |
| Birmingham, MI | 48009 | - | $492,100.00 | $139,160.00 | 73.93% |
| Oklahoma City, OK | 73116 | 1 | $221,500.00 | $92,449.00 | 57.04% |
| Brookfield, WI | 53005 | 2 | $310,700.00 | $100,828.00 | 55.46% |
| Houston, TX | 77057 | 3 | $244,400.00 | $97,476.00 | 51.80% |
| Houston, TX | 77098 | 4 | $563,500.00 | $139,474.00 | 78.45% |
| Alpharetta, GA | 30009 | 5 | $362,200.00 | $109,342.00 | 51.44% |

If we run the same analysis on Hoboken, New Jersey, the most similar zip codes the model finds are in more urban areas such as San Francisco, Chicago, and Brooklyn.

| City/MSA | Zip Code | Similarity Rank | Median Home Value | Median Household Income | % of Population with a Bachelor's Degree or Higher |
| --- | --- | --- | --- | --- | --- |
| Hoboken, NJ | 07030 | - | $703,000 | $197,100 | 74.30% |
| Old Town Chicago, IL | 60610 | 1 | $295,600 | $129,925 | 68.52% |
| Polk/Russian Hill San Francisco, CA | 94109 | 2 | $1,190,000 | $146,532 | 63.59% |
| Prospect Heights Brooklyn, NY | 11238 | 3 | $965,400 | $117,375 | 58.08% |
| Iowa City, IA | 52240 | 4 | $215,700 | $79,597 | 34.57% |
| Minneapolis, MN | 55408 | 5 | $283,900 | $87,500 | 49.74% |

Now that we’ve established that our model can accurately describe a zip code numerically, and that we can use this numerical description to compare the zip code’s likeness to other regions within the US, we can leverage our model to identify markets poised for growth.

Finding the next Austin

It’s no secret that commercial real estate in Austin has performed extremely well over the past cycle. Using our model, how can we identify markets that exhibit similar characteristics to Austin in 2014?

Leveraging our zip code vectors, we can simply aggregate them to the county level. In this case, we’ll look for counties in 2019 with similarities to Travis County, TX, which houses our target Austin.
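One simple way to perform this aggregation is to sum the zip code vectors that fall within each county, as in the sketch below. Summation is an assumption here, since the article does not specify the aggregation method, and the vectors are made-up numbers; the zip-to-county mapping covers only three zip codes for illustration.

```python
# Small illustrative crosswalk from zip code to county.
zip_to_county = {
    "78701": "Travis County, TX",
    "78704": "Travis County, TX",
    "37203": "Davidson County, TN",
}

# Hypothetical zip code vectors that share one category layout.
zip_vectors = {
    "78701": [4, 2, 1],
    "78704": [3, 0, 2],
    "37203": [5, 1, 3],
}

def aggregate_to_county(zip_vectors, zip_to_county):
    """Sum zip code vectors element-wise within each county."""
    county_vectors = {}
    for zip_code, vector in zip_vectors.items():
        county = zip_to_county[zip_code]
        current = county_vectors.get(county, [0] * len(vector))
        county_vectors[county] = [a + b for a, b in zip(current, vector)]
    return county_vectors

county_vectors = aggregate_to_county(zip_vectors, zip_to_county)
print(county_vectors["Travis County, TX"])    # [7, 2, 3]
print(county_vectors["Davidson County, TN"])  # [5, 1, 3]
```

The resulting county vectors can be compared with the same similarity measure used for zip codes, for example a 2019 vector for each candidate county against a 2014 vector for Travis County.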

| City | Similarity Rank |
| --- | --- |
| Nashville, Tennessee | 1 |
| Indianapolis, Indiana | 2 |
| Raleigh, North Carolina | 3 |
| Charlotte, North Carolina | 4 |
| Houston, Texas | 5 |
| Dallas, Texas | 6 |

Here, we see that Nashville comes closest to resembling what Austin looked like in 2014. In other words, as ranked by our model, Nashville looks best-positioned to be the next Austin, a view we then reinforce with a more traditional analysis of the Nashville market to more comprehensively understand the opportunity.

(Read: Market Spotlight: Nashville)

What does this mean?

At Cadre, we’re focused on understanding the complete picture. The importance of incorporating non-traditional data into one’s investment thesis cannot be overstated. While we’re currently focused on market-level analysis, continued investment will allow us to conduct this analysis at even finer granularities. All said, this approach to understanding the trajectory of a market provides only part of the picture, and should be paired with adequate fundamental analysis for a holistic view of the market.

Cadre provides accredited investors with direct access to institutionally-underwritten and data-driven commercial real estate investment opportunities. To get started, please request access to the platform.

