Project Cascade: Tracking Content From Inception Thru Dissemination

Cascade allows for precise analysis of the structures which underlie sharing activity on the web.

This first-of-its-kind tool links browsing behavior on a site to sharing activity to construct a detailed picture of how information propagates through the social media space. While initially applied to New York Times stories and information, the tool and its underlying logic may be applied to any publisher or brand interested in understanding how its messages are shared.


This is absolutely fascinating, and shines a little light on the social loopholes in the NYTimes paywall.

Getting Nationwide MLS Listing Data

When it comes to building real estate websites, listings are kind of important. Just kind of. That is true whether you are an agent/broker trying to be relevant to buyers & sellers in your area or a serial entrepreneur trying to build the next national real estate portal with your own unique twist. I recently got a private question from someone on Quora asking how to get comprehensive MLS listings, so I thought I’d answer it publicly rather than privately.

Adding listings to an agent or broker website in one specific market is a known process – just sign up for IDX, hook it to your website, and away you go.

But for entrepreneurs looking to get nationwide listing inventory — well, it’s not so simple. For those of you in this situation, there are three primary options to consider:

  1. Direct from the source (agents and brokers) – This is the route companies like Zillow and Trulia took in order to give them maximum long term flexibility. But it’s extremely time intensive and relationship heavy. You’ve got to win over the likes of ERA, Prudential, Weichert, and about a thousand other medium and large sized brokerages and convince them that syndicating their listings to you is a good idea. Or you can spend a boatload of money trying to reach 500,000 agents individually. Whichever route you take, you’ve got to be committed to the effort over the long run and put in the time to form real relationships with key stakeholders at a variety of organizations.
  2. Aggregate MLS feeds across the country – This route still requires that you aggregate hundreds of MLS feeds (there are roughly 900 MLSs) to get comprehensive coverage. Plus, you’ll have to have a sponsoring real estate agent/broker in each market. Additionally, if you aggregate MLS feeds, you are bound by MLS rules that vary from MLS to MLS (making building your national site a pain in the rear).
  3. Use ListHub or Point2 – This will probably get you the greatest number of listings in the shortest amount of time. But it’s still not going to result in comprehensive coverage across the United States. Not all brokers/agents use one of those two syndication partners.

So, in short, there is no quick way to achieve comprehensive listing inventory around the country. Unfortunately for serial entrepreneurs, but fortunately for the Zillows of the world who have a considerable head start (they’ve been working on it since 2007), if you start now – you don’t really have a chance at having comprehensive listings within the next 2 years. Unless you want to pay a LOT of money to agents and brokers to get them.

Regardless of which route you take, you’ll have to build an XML import system that can handle multiple XML feeds and de-dupe listings that come from more than one source simultaneously. Hope this helps clarify that whole (non-existent) “nationwide listings data” thing.
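For the technically inclined, here is a rough sketch of what the import-and-de-dupe step might look like. Everything about the feed layout is an assumption made up for illustration: real feeds from brokers, MLSs, or syndicators each have their own schema, so the `listing`, `street`, `postal_code`, `price`, and `updated` element names are hypothetical, as are the source names and sample data. The de-dupe key (normalized street plus postal code) is deliberately crude.

```python
import xml.etree.ElementTree as ET


def normalize_address(street, postal_code):
    """Build a crude de-dupe key: lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(ch for ch in street.lower() if ch.isalnum() or ch.isspace())
    return (" ".join(cleaned.split()), postal_code.strip())


def parse_feed(xml_text, source):
    """Yield one dict per <listing> element (hypothetical schema)."""
    root = ET.fromstring(xml_text)
    for node in root.iter("listing"):
        yield {
            "source": source,
            "street": node.findtext("street", default=""),
            "postal_code": node.findtext("postal_code", default=""),
            "price": int(node.findtext("price", default="0")),
            # Assumed to be an ISO date string, so string comparison orders it.
            "updated": node.findtext("updated", default=""),
        }


def merge_feeds(feeds):
    """feeds maps a source name to XML text; keep the freshest copy of each listing."""
    merged = {}
    for source, xml_text in feeds.items():
        for listing in parse_feed(xml_text, source):
            key = normalize_address(listing["street"], listing["postal_code"])
            current = merged.get(key)
            if current is None or listing["updated"] > current["updated"]:
                merged[key] = listing
    return list(merged.values())


# Two made-up feeds that both carry the same house on Main St.
FEED_A = """<listings>
  <listing><street>123 Main St.</street><postal_code>98101</postal_code>
    <price>450000</price><updated>2011-04-01</updated></listing>
  <listing><street>9 Elm Ave</street><postal_code>98102</postal_code>
    <price>300000</price><updated>2011-04-02</updated></listing>
</listings>"""

FEED_B = """<listings>
  <listing><street>123  MAIN ST</street><postal_code>98101</postal_code>
    <price>445000</price><updated>2011-04-03</updated></listing>
</listings>"""

listings = merge_feeds({"broker_feed": FEED_A, "syndicator_feed": FEED_B})
```

In this toy run the Main St house appears once in `listings`, taken from whichever feed updated it last. A key this simple will miss variants like "St" versus "Street", so a real system would likely need proper address standardization or an MLS number where one is available.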

If anyone else reading has alternatives, by all means, leave them in the comments.


On Syndication: Is A MLS A Data Repository, or An Exchange?

This is an exchange, that happens to throw off data

A current discussion within the MLS and tech vendor industry is around the issue of listing syndication. This post by Brian Larson, and the discussions therein, is a pretty good summation of the thinking on the part of MLS executives, vendors, and consultants. As Victor Lund of the WAV Group, a leader in the world of MLS consulting, notes in the comments:

Syndication is absolutely a nightmare on many levels – the control of the data quality is gone – leaving behind dregs like duplicate data, false data, reproductized data, resold data, loss of ownership by brokers, loss of copyright by MLSs, reduction in the quality of curated listing content – yadda, yadda.

For what it’s worth, I agree with Victor 100%… if the MLS is a data collections company, like say NPD Group which collects retail data from thousands of point-of-sales systems. Then the practice of syndication is a nightmare, and a disaster.

I believe, however, that there is a real question as to whether the subscribers to the MLS, the brokers and agents who actually create the data that constitutes the valuable intellectual property at question, see things that way. Most working real estate brokers and agents I know think of the MLS as a way to advertise properties for sale (let’s stick strictly with listing brokers/agents for now). I don’t believe that they think of what they’re doing, when they’re at the MLS screen entering data, as anything other than putting in information to get a house sold.

The popular and oft-heard response to this line of reasoning is, “Well, it’s both, Rob”. (Shortly followed by or preceded by, “You’re so black-and-white; the world is shades of grey, son!”) It is true that I tend towards black-and-white thinking, even if I recognize that in the real world of implementation, sometimes you have to tolerate shades of gray. But that is because, without such clarity in thought, effectiveness in action is impossible.

Another way to think about it is from a prioritization standpoint. Fine, a MLS is both a data repository and an exchange. Which is its primary identity, and which is the secondary? Consequences follow from the answer.

If the MLS Is A Data Repository

Let’s say that the answer is that a MLS is a data repository. It may have begun as a way for brokers to cooperate in getting a house sold, but in this day and age of the Internet and sophisticated data analytics, the primary purpose of a MLS is to provide clean, accurate, timely property data to real estate professionals, consumers, and other users of real estate data.

Certain consequences follow this definition of the MLS.

  1. Syndication must be eliminated, except in cases where the MLS can make a reasonable business decision to do syndication, under its licensing terms, with varying degrees of control dependent on compensation.
  2. In fact, if the MLS is primarily a data repository, its membership agreements probably should spell out that it will be the exclusive provider of listings data, and that the listing brokers and agents will surrender their rights to send the same intellectual property to a different source. If I contract to write columns for AOL, I cannot then send that same column to Yahoo, unless our agreement says I can. The same analysis must apply to listings entered into the system by brokers and agents.
  3. Intellectual property rights, sharing of those rights, and various mutual licensing arrangements must be clarified and agreed upon by all participants, including the real estate agent who is actually doing all of the data entry. At a minimum, if the MLS is a data repository, and its subscribers are paying to create the valuable intellectual property that is being deposited into the repository, then some accommodation has to be reached between the MLS and the subscribers as to if, when, and how those content creators ought to be compensated for their efforts.
  4. The data that is being entered, aggregated, and re-sold/licensed needs to be examined for more than what it is today. There are hundreds of data fields in a listing that go unfilled because they’re not particularly relevant for attracting buyers to the property. But maybe those fields — like soil type, distance to power lines, etc. — are very relevant for sophisticated users of the data.
  5. IDX must be put back on the table for discussion.
  6. There has to be a discussion about the equal treatment that listing brokers and buying brokers receive in the MLS. The former creates IP that will be leveraged and monetized; the latter does not.
  7. A real discussion has to be held as to the minimum useful geography if a MLS is to be thought of primarily as a data repository. Real estate may be local, but data is not particularly useful unless it’s at a certain size. It’s impossible to do trend analysis on four closed sales in one zip code.

Each or all of these things can be modified, tweaked, or changed based on how important the secondary purpose of advertising a home for sale is to the MLS and its subscribers. But if the MLS is widely understood by all stakeholders and participants to be a data repository, the relationship between the brokers, agents, Associations, and the MLS will likely need to be renegotiated.

If the MLS Is An Exchange

If, on the other hand, all of the stakeholders and participants understand the MLS to be an exchange, created for the primary purpose of selling a home, then other consequences follow.

  1. Whatever value the property data might hold, that value is subordinate to the primary value of advertising a home for sale. Syndication must not only continue, but be expanded, and the propriety of charging licensing fees and other revenues at the cost of wider advertising distribution must be examined.
  2. The whole concept of data accuracy and data integrity has to be understood in the context of advertising a property for sale, rather than the context of third party users such as government agencies, banks, and academics.
  3. MLS rules and practices should be re-examined in light of the clarified understanding of the MLS as an exchange facilitating the sale of a home.
  4. MLS products and services that do not advance the primary goal of advertising homes for sale need to be validated by the leadership and by the subscriber membership.
  5. There has to be a discussion not of minimum geography, but of maximum geography. It isn’t logical to believe that if the MLS is merely an exchange, and real estate is local, then a super-regional MLS could serve the advertising function as effectively as a hyperlocal one.

Again, these consequences can be modified, tweaked, altered, and so forth based on the particular MLS’s stakeholders deciding how much to be influenced by the secondary function of data services.

My Take On The Issue

My personal take, after laying out the issues, is that a MLS is first and foremost an exchange, created for the primary purpose of advertising homes for sale. The exchange activities happen to throw off extremely valuable intellectual property as a byproduct: accurate, timely, and comprehensive data on real estate activities. To the extent that the activity generates valuable assets (much like how fertilizer is often a byproduct of raising cattle), the MLS should attempt to control and monetize those assets. However, like any other byproduct, one would not impinge on the primary purpose for the sake of the secondary.

The whole syndication debate is far more complex and far more detailed, of course. And as I’ve mentioned, in the real world of implementation, there are going to be some grey areas. But I believe much of the confusion in the industry today around the issue stems from the fact that the big assumption has not been adequately communicated, discussed, or accepted by some of the main stakeholders: brokers and agents. Debate and settle the big question, and the details can be resolved using the clear understanding of primary vs. secondary purposes.

MLS Tesseract: A Listing Syndication Discussion

There has been a great deal of talk and writing lately about MLS/broker listing syndication. This isn't surprising considering one of the leading syndication providers, Threewide, was recently acquired by Move, Inc., operator of At the same time, initiatives like those led by industry veterans Bud Fogel and Mike Meyers and by LPS's Ira Luntz (himself a veteran of syndication) suggest that a new model could be in the offing. Among commentators who have taken up the issue are MRIS Chief Marketing Officer John Heithaus, in an often-discussed post; Victor Lund of the WAV Group (you need to be an Inman subscriber to read that one); and Rob Hahn.

Elizabeth and I wrote a longish whitepaper on syndication back in 2008, which is still available on our firm web site. That paper, written just before Elizabeth and I formed our firm together, focused on the role of MLS and how MLSs should approach operational and legal issues; it assumed that MLSs would want to do syndication because at least some brokers wanted syndication. Most of the concerns we expressed then remain unaddressed in the industry. The current debate, it seems to me, is about whether brokers and MLSs should be doing syndication at all. It takes up the key assumption in our 2008 whitepaper.

I'd like to expend some effort and thought on this topic in the next few posts. This first one will define what I mean by "listing syndication," in order to distinguish it from other forms of listing data distribution and licensing; and it will discuss some reasons MLSs get involved in syndication. In the next post, we'll consider ways that MLSs get involved in syndication and some problems and issues. Then we'll take a look at the key underlying question: should brokers be sending listings to all these places in the first place, and what role should MLSs play in that decision?

“Syndication” defined

There is no official definition of “listing syndication.” There is no Platonic universal or form corresponding to listing syndication. So, we just need a practical definition that provides some scope to what we are talking about. For this summary, I will use the following definition: “listing syndication is the distribution in bulk of active real estate listings (listings currently available for sale), by or on behalf of the listing agent or listing broker, to sites that will advertise them on the web to consumers.”

We include each of the following things in this definition:

  • Distribution of listings by MLS through a listing syndicator, such as ThreeWide (ListHub) or Point2, to advertising sites.
  • Direct distribution of listings by MLS to advertising sites (including local newspaper web sites and national sites).
  • Distribution by a listing broker via a data feed (whether broker-internal or created by MLS on the broker’s behalf) to advertising sites.
  • Use by a broker or agent of a service that offers to take a bulk data feed and then distribute the listings to advertising sites.

I usually call web sites that advertise real estate listings “aggregators” (we used "commercial distributors" in the 2008 report). I usually refer to the recipients of data through syndication as “syndication channels.” As a result, a site like is both an aggregator and a syndication channel. I do not consider the following syndication (at least for purposes of this discussion), though they share some characteristics with it:

  • Services where agents manually load their own listings in, like vFlyer and Postlets. These sites generally do not take bulk data feeds.
  • Data licensing to RPR or CoreLogic under their current proposals. They are using off-market listings, in addition to active ones, and the applications they use them for are not advertising. Sending data to Move, Inc. for its Find application is not syndication because it includes off-markets; but see the discussion relating to that below.
  • A “back-office data feed” from MLS to the listing brokerage. A back-office feed often includes all the MLS listing data (from all brokers) and comes with a license for the brokerage to use the data internally for the core purposes of MLS and the freedom to use its own listings any way it pleases. Many MLSs provide such feeds to their participants to facilitate brokerage business activities. Thus, though MLS’s action here is not syndication, the brokerage might turn around and engage in syndication itself.

Why only active listings? We do not include off-market listings (listing records relating to properties not currently for sale) in our definition of syndication for two reasons:

  1. MLSs perceive the off-market listings as something different. Most MLSs recognize that very recent off-market activity in the MLS provides a very valuable resource, one not available elsewhere. Thus very few MLSs distribute off-market listings through typical syndication channels.
  2. Brokers perceive the off-market listings as something different. Brokers want their active listings advertised (and their sellers want it, too). But brokers rarely perceive value in having their off-market listings distributed. We are not acquainted with any brokerage firm that distributes its off-market listing data.

The value MLSs bring to syndication

Almost from the beginning of “listings on the Internet,” people have asked what role the MLS should play in getting listings out there. The short answer is efficiency:

  1. MLS already has all the listing content in one database. Every listing broker has already paid for that database to be created and maintained; and every such database has the capability to export listing data. In theory, at least, it should always be cheaper for MLS to ship brokers’ listings to a channel, because it requires fewer steps. In the alternative, MLS would supply each broker a data feed of its own listings, then each broker would have to set up a feed to each channel (or at least set up a feed to a syndicator who could reach the channels). That’s a lot more data feeds, IT staff hours, etc.
  2. Many MLSs and traditional syndicators permit broker ease-of-use. Syndicators like Point2 offer brokers a dashboard where they can click on the channels they want to receive their listings and click off the ones they don’t want to receive them. In theory, this does not require the broker to perform research and due diligence on each channel; the syndicator or MLS has theoretically done that before presenting the option on the dashboard. (In practice, this may not be happening.)
  3. Syndication through MLS or a syndicator may give listing brokers more leverage. If a channel is getting the listings from many brokers in MLS through a data feed from MLS, the MLS may have leverage with the channel to get it to behave properly. If the MLS cuts off the data feed, the channel loses listings from all the brokers. Similarly, if a syndicator cuts off a feed to a channel, the channel loses the feed for all the MLSs working with the syndicator. A single broker, by comparison, usually does not have the volume of listings to exert leverage on the channels. Note that some channels (like Google, before it decided to stop accepting listings) did not necessarily react to that leverage anyway. Note also that just because MLSs and syndicators have this leverage, that does not mean they have actually used it (I'll discuss this in another post).
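To put rough numbers on the "fewer steps" argument in the first point, here is a back-of-the-envelope count of data feeds under each arrangement. The broker and channel counts are hypothetical, picked only to show the shape of the arithmetic, and the sketch assumes every broker wants to reach every channel.

```python
def feeds_mls_direct(channels):
    """MLS ships listings straight to each channel: one feed per channel."""
    return channels


def feeds_broker_direct(brokers, channels):
    """MLS sends each broker its own listings, then each broker feeds each channel."""
    return brokers + brokers * channels


def feeds_broker_via_syndicator(brokers, channels):
    """MLS sends each broker its own listings, each broker feeds one syndicator,
    and the syndicator feeds each channel."""
    return brokers + brokers + channels


# Hypothetical market: 200 brokers, 10 advertising channels.
BROKERS, CHANNELS = 200, 10
mls_direct = feeds_mls_direct(CHANNELS)
broker_direct = feeds_broker_direct(BROKERS, CHANNELS)
via_syndicator = feeds_broker_via_syndicator(BROKERS, CHANNELS)
```

With those made-up numbers, the MLS-direct route needs 10 feeds, brokers going through a syndicator need 410, and brokers going it alone need 2,200. The exact figures matter less than the pattern: the broker-driven counts grow with the number of brokers, while the MLS-direct count does not.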

The resourceful broker problem

One important fact about syndication is what we call the “resourceful broker problem.” It’s not really a problem at all; it’s just competition. If an MLS does not syndicate listings on behalf of its brokers, some of the brokers will assume the costs and work associated with syndicating their own listings. This will give those brokers a competitive advantage in the market. Note that we don’t call this the “large broker problem.” Though the large brokers in markets are also often resourceful brokers, in many cases, smaller brokers also find the means to be resourceful. The MLS is confronted with its age-old problem, almost its nemesis: Choose between (a) delivering services at the lowest common denominator, drawing complaints from some brokers that MLS should be doing more to deliver efficiencies to all brokers; and (b) delivering efficient services to all brokers, drawing complaints from resourceful brokers that MLS is “leveling the playing field.”

Neither of these arguments is wholly right or wrong. But they appear in some form whenever MLSs consider offering services like syndication. The intensity of feeling about which path the MLS should take varies a great deal from MLS to MLS and often within the board room of a single MLS.

So, we've stipulated a definition for "syndication"; should we be including other things, or perhaps excluding something I've included here? And we've discussed why MLSs often believe they should be involved. I'm curious what your thoughts are about my efficiency arguments there. Next time, we'll consider some ways that MLSs do, and don't, syndicate. Following that, I'd like to spend a little time considering where brokers should be sending their listings and whether the MLS should be deciding for them.