What I learned this week: Data Skew

Disclaimer: In the spirit of full transparency, I learned about data skew a little while ago. But the whole point is “what I learned this week.” In some cases, “this week,” just refers to this week in time…like…last week, last month, whatever.

My first brush with NPSP was as a consultant. I remember very clearly thinking that some of the features would have been very handy for my B2B sales staff back in the day. In a lot of ways it was love at first sight. I still get prickly when people say mean things about it…

[Insert about a half hour of me looking for the best option for a “Don’t talk to me or my son ever again” meme before realizing there could potentially be a better use for my time.]

That said, the first time I started getting error emails at about 2am was ALSO around this time.

You know, this one:

Message: “First error: Update failed. First exception on row 0 with id 001……………; first error: UNABLE_TO_LOCK_ROW, unable to obtain exclusive access to this record or 1 records: 001…………….: []”

And I was flummoxed. What does that even mean? Why are you locking anything? Who said that you needed exclusive rights? And what does this have to do with merging records?

For a while I sort of…ignored it. Honestly it would run again at some point, right? It rarely happened more than once for the same record.

Sometimes I would have dozens of them. Usually right after some major data change or something. I suspected they were related, but I had other pressing concerns, and eventually everything would be sorted.

Over time I filled in the blanks. Unable to lock row meant that whatever the code was trying to do, it couldn’t get update access to the record.

If I spent more than 30 seconds on it, it made sense. A record cannot be edited by more than one person at a time, so why would it make an exception (ha ha – get it?) for custom code.

And then again, for a while, I left it at that.

Enter Data Architect Trailmix, stage right.

A super important part of the large data volume considerations that are discussed in the data architect arena is the concept of data skew. And as I read about it, I was taken back to a project early on, a move from the Starter Pack and a bucket model to NPSP with Household Accounts.

This client was looking to upgrade to the new success pack. They had been using the bucket model for YEARS – more than 50,000 contacts all inelegantly shoved into this single Account called “Individual.”

It was difficult to report on things. It took forever for the record to load.

I knew that there was a correlation, but I could not, especially at that time, explain what it was. I had a sense that having to many child records was a bad thing. I didn’t know what to call it. And I wouldn’t know, until years later, that that very situation was what caused errors during the overnight batch processing.

Data skew occurs when we have too many child records, plain and simple. It has an impact on loading time (you try showing a record and querying tens of thousands of records at one time), reporting, and…yes, automation.

It doesn’t exactly help me fix the errors all the time. Sometimes it’s just bad timing, and not even because of data skew. But putting a name to something makes it more accessible, less concerning.

Carry on, NPSP. Carry on.