Thursday, April 12, 2012

HUH?: Making Sense of The Miami Herald Star System Dining Reviews

The starred review system was probably designed so that left-brained, data-centric people (like me) could digest the subjectivity of something like a restaurant or hotel review, which is usually written by right-brained, restaurant/hotel-centric types.

But there's been lots of chatter recently regarding the use of the star system.  The quantity and variety of reviews that use stars is immense.  Stars are used to rate everything from the caliber of your hotel in Madrid to the safety of your car.  The level of subjectivity of star systems depends on how they're used.  For hotels, if rooms have 600-count sheets, marble floors and on-call butlers, they'll get more stars.  If crash test dummies can walk out of an accident to tell bad jokes, that car gets more stars.  Dining reviews, however, well, they're among the most subjective reviews to use a star system.  Does the uptown, top floor bastion of fine dining with tired, mediocre dishes deserve more stars than a downtown taco shack with the best tacos this side of Baja?

Progressive periodicals are dispensing with the star system for restaurant reviews (I'm looking at you, LA Times).  On the other hand, certain French travel guides would cease to exist if they dropped their star system.  Which brings me to our local periodical, which basks in the star system as if it were an innovation on par with movable type.  Recently, the intrepid restaurant critics of The Miami Herald have gone on a tear of perplexing starred dining reviews that demonstrate their complete lack of a methodology for dishing them out.  Not only does there not seem to be a fixed methodology across the critics, individual critics don't seem to have a consistent credo of their own.

So what does an MBA do when trying to quantify the perplexing star system?  Well, he looks at the data, and given the Herald's propensity to dumb things down, they made it pretty easy.  A peek of each review appears in the Herald-owned (and gets aggregated with other local online reviews by Eater Miami) prior to  the full review running in the Herald.  These peeks take each review and compact it into bullet points (which I'm a huge fan of) titled "What worked" and "What didn't".  So here we have a great set of data points to use.  Each observation of the restaurant fits into one of these categories.  Steak was done to perfection = What worked.  Lousy service with long waits = What didn't work.  Easy as pie.

Then come the stars.  Each reviewed restaurant is given a rating between one and four stars with one being, well, I don't know what it means because there's no legend to go by.  I have to assume one is bad and four is exceptional.  But a quick glance at a few reviews shows that 1.5 stars denotes subpar while a 3.5 star review = exceptional.  I must then assume that 2.0 stars = par and that 4.0 stars = exceptional + 0.5.

All's good so far.  We've got a list of "What worked" and "What didn't" at each reviewed restaurant as well as its stars.  But how do these two data sets relate?  Well, many times they don't.  Sifting through the data, I counted all observations under "What worked" and "What didn't" to get a sum of total observations.  I then took the number of "What worked" observations and divided by the total to get a "% of What Worked".  Overlaying this percentage with the stars assigned in each review gets you a picture of how in or out of whack the two are.  Examples of some recent reviews appear in the chart below with the blue bars denoting the star rating and the orange squares denoting the "What worked %" ("What worked" observations divided by total observations).

There are a few instances where both data sets come out to similar results, but in most cases there are some pretty egregious differences.  Let's look at the extremes.  Victoria Pesce Elliot waxed poetic about Norman van Aken's Tuyo at Miami Dade College, giving the restaurant a study-high rating of 3.5 stars.  At the other end of the spectrum is Blue Collar, also reviewed by VPE, which serves up unabashed American comfort food with sophisticated touches (and I mean Miami American, where tostones and latkes peacefully coexist).  Blue Collar received 1.5 stars.  Now looking at the "What worked %", Tuyo came in at a so-so 73.7% (14 worked out of 19 total observations), which was below all but one of the 3 star establishments in this study.  Blue Collar, on the other hand, had a 50% "What worked %" which, when compared to restaurants at the 2 star level, surpassed one and tied the other.
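For anyone who wants to check the math, here's a minimal sketch of the "What worked %" calculation in Python.  The function name `worked_pct` and its parameters are my own illustration, not anything the Herald publishes; the only real numbers used are Tuyo's 14 "What worked" observations out of 19 total, which implies 5 "What didn't".

```python
def worked_pct(worked: int, didnt: int) -> float:
    """Return the share of "What worked" observations as a percentage.

    worked: number of "What worked" bullet points in a review
    didnt:  number of "What didn't" bullet points in the same review
    (Both names are hypothetical; the review summaries carry no such fields.)
    """
    total = worked + didnt
    if total == 0:
        # A review with no bullet points has no meaningful percentage.
        raise ValueError("review has no observations")
    return round(100 * worked / total, 1)


# Tuyo: 14 worked out of 19 total observations, so 5 that didn't.
tuyo = worked_pct(worked=14, didnt=5)
print(f"Tuyo: {tuyo}%")  # Tuyo: 73.7%
```

An even split between the two lists comes out to 50%, which is where Blue Collar landed.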

The discrepancies become more egregious when looking at the subjective observations of these reviews.  Again, Tuyo, at 3.5 stars, should be considered one of the city's best; however, here are some of the dings from VPE.  Would you think a 3.5 star establishment had "uninspiring, overworked pompano" or a "waiter who lacked a grasp of both English and the menu" and "a kitchen that can delay dishes"?  These observations absolutely scream 3.5, or near perfection.  Then let's take Kopas, a 2 star establishment that had a "What worked %" of 36.4%, lower than the 1.5 starred Blue Collar.  Comments on Kopas' shortcomings included "It was a solitary pursuit. On both 8pm visits, we were the lone diners in a large, lacquered room with an empty outdoor patio and a sole waiter humming Pitbull tunes behind a shale-rock bar; chunks of fried meat and fish in soggy batter; an unidentified fish in classic tiradito; mushy shrimp ceviche."  So an empty restaurant with soggy battered meat and unidentified fish gets a passing grade whereas Blue Collar, an establishment (i) that is consistently packed by the young (and old, despite VPE's review stating that the clientele was of the young variety), (ii) where you can identify what's on your plate, (iii) where there's no Pitbull to be heard and lastly (iv) that has a higher "What worked %", got rated as subpar.

It's well known that newspapers across the country have had to cut back, our own Miami Herald included.  Staff reductions and mandatory furloughs have hit the morale of traditional journalists working at local papers.  I believe eventually the market will lead these papers to the right business models that will ensure their longevity.  But one characteristic that will survive is good, consistent reporting and content.  The methodology being employed by the Herald for dining reviews is epically backfiring.  Perhaps a more factual style, similar to the weekly "Fork in the Road" by Linda Bladholm (the only writer I tend to respect in the Herald's food and dining section), where there are no stars employed and no hyperbolic language to be found, could bring some respect and regard to these reviews.


  2. The Herald's star system is just slightly less perplexing if you read the actual reviews in the Miami Herald (or rather than on Among other things, they give a translation (with each review, anyway) of what the stars mean. 1.5 = "Subpar", 2 = "OK", 2.5 = "Good", 3 = "Very Good", 3.5 = "Excellent."

    As far as those ratings consistently meaning anything? There are at least a couple issues.

    First, I'm not completely convinced that the abbreviated, bullet-pointed "reviews" that appear in (as distinguished from the full print versions of the reviews) are even written by the critic(s) themselves, rather than compiled by some editor / intern / robot (the same way that the headlines are typically not written by the author, so that they're sometimes wildly disconnected from the substance of the review). So trying to reconcile the star rating to the ratio of "what worked" and "what didn't work" may be a fool's errand, because I'm not convinced the bullet points - even though they're pulled from the review - are necessarily representative of the review as a whole.

    Second, the Herald now employs at least three different critics simultaneously - VPE, Jodi Mailander Farrell, and Enrique Fernandez. So there's no consistent "voice," nor is it even clear that they all agree on what the criteria are for the ratings. Of course, it doesn't seem even the New York Times (one of the most venerable of the "star systems") has a strict checklist of what accounts for each star level when the baton gets passed from one critic to another, but at least you only have to calibrate for the opinion of one critic at a time.

  3. Frod: I'll give it one shot to convince you of my argument before saying we'll have to agree to disagree.

    As much as you're convinced that the bulleted comments come from someone other than the critic, my argument is predicated on giving these critics credit and assuming that they are either writing these bullets themselves or providing someone the notes to derive said bullets. Even if it's the latter, I can't fathom any critic leaving it up to someone else, whether an editor, intern, etc., to come up with these quips using their name without at the very least reviewing them. If the reality is that someone other than the critic is entirely responsible for these quips without the critic's involvement, then I think the methodology for these reviews is worse than I first thought.

    But going with the premise that yes, all critics write their own quips for, I'll go back to my favorite example of Tuyo. 3.5 stars were dished out by VPE for a restaurant that she said, "Though I am not usually a huge bread fan, I could not resist the perfectly crusty, gently chewy and slightly oily ciabatta squares powdered with flour," and "Next we were beguiled by a Joel Robuchon tuna tartare mixed with bits of tomato diced so small they might have been pepper flakes. A gently coddled egg on top provides the rich sauce to balance the fresh fish, with a scattering of matchstick-fried potatoes lending the crunch and salt." That's great, 3.5 stars because a restaurant got VPE to eat bread even though she's not a bread person and made a stellar tuna tartare that's the creation of another chef (sorry, I had to let some snarkiness out). The review admittedly was very positive and had I read it on its own would've thought Tuyo deserved 3.5 stars for being "excellent". But how can you get past the quips in the review attributed to VPE where there was "A waiter who lacked a grasp of both English and the menu" and also have "A kitchen that can delay dishes (kitchen is one floor below dining room)" and finally deal with a "Disappointing, one-dimensional deconstructed chocolate 'soufflé'"? None of these atrocities were mentioned in the Herald version of the review, yet how can you ignore that there were some fundamental flaws in service and execution and go ahead and award a restaurant 3.5 stars? You're putting this restaurant in the upper echelon of restaurants in the city yet there's obviously some basic things missing such as knowledgeable waiters and a finely tuned kitchen. Imagine Per Se having a waiter that didn't know the menu like the back of his hand, much less not able to communicate it because he didn't speak the language? And do you think Eleven Madison would be rated excellent if they had issues with timing in the kitchen? 
Nah, at the Herald things like this are overlooked, for the sake of what I'm still not sure.

    This entire exercise was done with tongue in cheek, but underlying it is a frustration with some of the Herald's dining section. I'm a big fan of Linda Bladholm's Fork in the Road and I get a lot of great information from the First Look posts on from Sara Liss and her cohorts. Both Fork in the Road and First Look are usually well written, lean more to the factual and never dish out a rating which is something the main critics in the Herald may want to learn from.

    Anonymous: you're right, any advice for getting people to care what I think and to not suck at my day job?

  4. You misunderstand me. All the "quips" and bullet points are directly paraphrased from the full reviews. (The two examples you mention about Tuyo are both in the Herald review: "with a kitchen one floor down, dishes can be delayed"; "His one-dimensional deconstructed chocolate “soufflé” was the only disappointment".)

    But I don't think that means the critics actually write them or even necessarily see them before they're published on; rather I think it's entirely possible that someone else edits down the full reviews and reduces them to bullet points for (as for why they feel compelled to do so, when one of the great things about the internet is that you don't have to pay to print more paper for a longer article, well, that's another question). And as a result, I think the "condensed" version is not always reflective of the overall tone of the full review, and in particular that just the relative number of positive and negative bullet points in particular will not be very informative.

    The fact that you read the full review and didn't even notice those comments sort of proves the point: in context, a couple minor quibbles don't undermine an otherwise very positive review.

    For what it's worth, it's not a "Joel Robuchon tuna tartare" - it's just the cooking method for the soft-cooked egg that is attributed to Robuchon (Robuchon was one of the first to play with slow-cooked low-temp immersion circulated eggs, see here).

    1. Even if someone else does write the quips on, from spot checks I've done, and the two you've pointed out, they're often word for word from the long review. I think your point then may be moot as to whether or not the author writes the quips because it's often their own words. Whether or not they review them is another story. But given the quips come directly from the long review, then the analysis I did is pretty valid, it's just someone from the Herald or doing the condensing and saved me, and anyone else reading these reviews some time.

      Cutting to the chase, could you justify rating as excellent any restaurant that had issues with timing and pacing from the kitchen, waiters who lacked command of not only the menu but the English language, and a main and a dessert that were subpar? If you can somehow justify it to yourself because the "overall tone of the article" meant that it deserved such, then I'd be happy for you to be able to live with that delusion. I've made my point by looking at these reviews from an analytical perspective, sticking to facts and observations and stripping out tone which, as has been shown in many reviews, clouds the review and attempts to justify a higher, or sometimes lower, rating than the restaurant deserves.

      As for the "Joel Robuchon tuna tartare", it was the critic that called it such, not I. I did make the foolish mistake of thinking the critic had identified the dish correctly, but alas I should have known better than to rely on a Herald critic for anything factual. The menu itself says that it's the egg that is Robuchon-style. I'm sure the Herald appreciates you catching their factual error. And factual errors in these reviews, such as the one you pointed out, well, they probably deserve a study of their own.

