Retaining talent - comp style

I’ll caveat this post with the acknowledgement that compensation and promotion aren’t everything, and comp especially is just one dimension of what I consider to be a fulfilling job. Other dimensions worth mentioning in the same breath are autonomy/agency (especially technical autonomy in engineering roles), responsibility, visibility to leadership/external stakeholders/customers, and mission. I put mission last, but it should be first. The other dimensions (especially comp) can overwhelm mission for a time, but eventually it comes around. I mention these dimensions explicitly and up front because I want to talk about comp, and every manager will say “comp isn’t everything,” especially when that manager is already very comfortable and doesn’t want to open a req! Nevertheless, here we go.

One of the most irritating things about my current employer is that many of its requirements for promotion have a time-in-grade (TIG) component. My old job, for all its flaws, operated instead as promote-when-ready (PWR). In practice, since managers can’t drop everything they’re doing to evaluate whether you’re “ready,” this just meant (1) multiple promotion cycles annually and (2) no TIG requirement. I never saw anyone promoted after less than 1 year in a given role, but 1 year is pretty good.

Another important caveat to mention here is that I’m mostly talking about “junior” engineers/performers, not partners or tech fellows or (Assoc) directors. You generally can’t tell whether such people are ready to be promoted “out” of those roles until they have a track record to warrant it, and track records at those levels have longer time horizons.

What we want to avoid is resentment that occurs from undervaluing great performers. If you think you’re ready for the next step but I don’t, then you’re within your rights to leave and I won’t fight to promote you just to keep you. On the other hand, if you think you’re simply undervalued with respect to the market, and your contributions are commensurate, then you have an argument for a compensation review. We may not want to increase your comp to meet market, but you have a fair expectation and I would certainly wish you the best of luck. However, those aren’t the cases I’m talking about. Here, I’m talking about great performers who are stuck at the “wrong” level for too long. Either they were brought in too low (happens sometimes, even with new grads), or their technical chops put them on a different slope/trajectory than what the company has in mind. Objectively, this is a great thing. The company has landed someone who will probably always be underpaid, but through a first-mover effect and stickiness of the job market, we’ve ended up with a great performer… now if only we can keep them…

This person won’t wait around for a 3-4 year promotion cycle, and they shouldn’t. There are very few arguments against promote-when-ready; the only one that holds water is experience-based, generally identifying and rewarding knowledge (sometimes tribal or domain-specific) that comes from time and multiple projects. An aspect of this is customer interaction. There are precocious but prickly people who need a little bit of extra allocation in this dimension, at least relative to their otherwise incredible technical skills. However, even here I’ll defend the PWR approach. Some combination of employee self-awareness (knowing they need to work on a skill) and manager engagement should yield a coaching/mentoring endeavor to help this person train and level up in the missing dimension. This goal-setting is a priority for two reasons. First, the employee needs these candid conversations (and the self-awareness that comes with them) in order to realize they have gaps to address; otherwise they will just resent the slow pace of promotion and leave. If they understand they have something to learn, and hopefully someone to learn from, they will work on that skill, realizing it will unlock promotion and that another company would probably find them lacking in the same way. Second, managers should want to level up employees as quickly as possible, especially when their weaknesses are in a “social” dimension. This will help identify true talent that can handle anything. But additionally, it will start planting seeds for the type of future trajectory the high performer has. They may be someone who will struggle with customer interactions forever, in which case they are not suited for certain roles. That’s okay, but you (and the employee) want to know sooner; and besides, you can keep working on something you struggle with even when you know it’s a weakness (arguably these are the only things worth struggling for!).

The lengthy aside above is focused on technical folks who lack some “soft” skill, but it’s true of hard skills, too — provided you have correctly identified the person in need of mentoring. Most times, a good engineer with “high agency” (cringe) will only need to be pointed in a given direction once you tell them they have a technical blind spot. But for some, it’s worth finding an impactful way of helping their journey if it will unlock more value (for themselves and ideally the company).

My personal example is when I started studying hardware and mechanical engineering more seriously. Coming from data science and then running software programs, I had a major blind spot in hardware/ME/CAD. My company doesn’t necessarily care about this since we have “other” groups to handle that, and I was at liberty to stay in my lane. (I was not discouraged from learning, but it wasn’t laid out as an explicit goal or challenge by anyone but me.) Despite this, I decided to start learning as much as I could about that side of the company and field more generally, reading some books and annoying the hell out of colleagues. They were generous with their time and patient, though I suspect they mostly wanted me to go away so they could do their work. But I was sick of not understanding hard limits on HW performance, thermal issues, why cables were always the longest lead item, and why we were perpetually understaffed in CAD.

My case was one (I hope) of identifying a missing technical skill in an otherwise technically strong engineer, and addressing it to make my contributions more holistic. I will never perform as well as an ME or CADder in those fields, but I understand much much more about that larger world of the company than I did before. And that has made me a better engineer, not just on the ML and SW side, but overall. It helps elucidate design decisions that were made long ago but sometimes seem archaic. The short answer to such situations is that sometimes those decisions were inevitable, but other times they were received wisdom and should be changed. Enough about me.

All that is to say that employees and managers both have responsibilities to work with each other to improve their performance, assuming the employee is a good fit for the role. Compensation misalignment is real, but it does not excuse either party from trying to make roles more engaging and provide better opportunities for personal and technical growth. This is nothing new.

What started me down this path is having a desire to quantify the benefits of PWR over TIG. There are a few key assumptions here that are important to lay out.

The market is adequately compensating talent. This is a function of salary, bonus, and equity. Where there is no equity (at the Raytheons of the world), it’s all compensation.
If you are working for sub-market comp, you are generating less value for yourself and your company. This manifests as resentment that will lead to changing companies, and your current company will incur a cost associated with replacing you. It also assumes you are not fully productive because of a latent (or exposed) resentment.
Your best route to financial gain is changing jobs as frequently as possible, assuming your skills are increasing. This has negative effects in the real world, but not in my models 😄

The base of the model for TIG and PWR is defined here:

The first thing we notice is that the promotion lag for TIG, as expected, is significant and potentially catastrophic. Here we ignore any contributions of lambda or theta to the promotion assessment, although those are obviously prerequisites (i.e., TIG is a necessary but not sufficient criterion). Promotion under PWR, on the other hand, occurs at the first assessment event where the capability threshold is met or exceeded. So the lag depends on the capability growth rate lambda versus the assessment frequency alpha: if lambda >> alpha, then capability will overshoot theta long before an assessment occurs. There is risk in this model as

For simplicity, we can assume that once an organizational grade is achieved the base comp matches the market, so there is no arbitrage between newly minted org grades and new hires (e.g. at time of promotion there is no monetary incentive to change companies — though there may be other reasons).
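The model definitions themselves are not reproduced in this text, so the following is only a sketch of one formulation consistent with the description above, not necessarily the exact equations used. It assumes exponential capability growth from a starting value C_0 (my assumption), assessment events at times t_k = k/alpha, and Gaussian assessment noise epsilon with standard deviation sigma.

```latex
\begin{aligned}
C(t) &= C_0\, e^{\lambda t}, \qquad t_{\text{ready}} = \frac{1}{\lambda}\ln\frac{\theta}{C_0} \\
t_{\text{TIG}} &= \max\left(T,\; t_{\text{ready}}\right) \\
t_{\text{PWR}} &= \min\left\{\, t_k = k/\alpha \;:\; C(t_k) + \epsilon_k \ge \theta \,\right\}, \qquad \epsilon_k \sim \mathcal{N}(0, \sigma^2) \\
\Delta t &= t_{\text{promo}} - t_{\text{ready}}
\end{aligned}
```

Under this sketch the TIG lag is T minus t_ready whenever the individual is ready before T, while the PWR lag is bounded by roughly one assessment interval 1/alpha (and can be slightly negative when noise triggers an early promotion).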

I wanted to look at the differences between these models and their manifestations in capability gaps. Again, there will be no surprise here: we know the right answer and are partially engineering the model with the expectation that PWR will perform better. The instructive part will be how large the difference is and whether we can learn anything about PWR plus market adjustment, including dynamics, from the model.

Here we see that TIG frequently under-grades high performers (almost by nature), while the risk generated by PWR is defined by sigma, the assessment noise. Since assessment is not perfect, it’s possible someone will be promoted before ready, and this is captured by C(t)+epsilon in the PWR rule. This is not necessarily trivial, but it won’t significantly impact our comparisons either (and in any case it’s a real effect: we all know people who were promoted before they were ready for some reason, maybe a fluke allowed them to succeed when they were not equal to the task).
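To make the lag comparison concrete, here is a minimal Python sketch of the two promotion rules. It is an illustration under stated assumptions, not the code behind the results in this post: capability is assumed to grow as C(t) = c0 * exp(lambda * t), and the values theta = 1.5, sigma = 0.05, and c0 = 1.0 are hypothetical choices of mine. Only lambda = 0.2, T = 3.6, and alpha = 2 are taken from the ranges quoted later in the post.

```python
import math
import random


def ready_time(lam, theta, c0=1.0):
    """Time at which C(t) = c0 * exp(lam * t) first reaches theta (assumed growth law)."""
    return math.log(theta / c0) / lam


def tig_promotion_time(lam, theta, tig, c0=1.0):
    """TIG rule: time in grade is necessary, and capability must also meet theta."""
    return max(tig, ready_time(lam, theta, c0))


def pwr_promotion_time(lam, theta, alpha, sigma, rng, c0=1.0, max_years=50.0):
    """PWR rule: first assessment where the noisy observation C(t) + epsilon meets theta."""
    k = 1
    while k / alpha <= max_years:
        t = k / alpha  # assessments happen alpha times per year
        observed = c0 * math.exp(lam * t) + rng.gauss(0.0, sigma)
        if observed >= theta:
            return t
        k += 1
    return max_years  # never promoted within the horizon


rng = random.Random(0)
lam, theta, tig, alpha, sigma = 0.2, 1.5, 3.6, 2.0, 0.05  # theta and sigma are hypothetical

t_ready = ready_time(lam, theta)
tig_lag = tig_promotion_time(lam, theta, tig) - t_ready
pwr_lags = [pwr_promotion_time(lam, theta, alpha, sigma, rng) - t_ready
            for _ in range(2000)]
mean_pwr_lag = sum(pwr_lags) / len(pwr_lags)
# Share of runs where noise promoted someone before they were actually ready.
early_share = sum(lag < 0 for lag in pwr_lags) / len(pwr_lags)

print(f"ready at {t_ready:.2f} years")
print(f"TIG lag {tig_lag:.2f} years, mean PWR lag {mean_pwr_lag:.2f} years")
print(f"share of PWR promotions before ready: {early_share:.2f}")
```

A negative PWR lag is the "promoted before ready" case: with these hypothetical numbers it only happens at the assessment immediately before the true crossing, which is why the noise matters but does not change the comparison.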

Since we ignore annual (etc) raises, and we assume promotions result in market-adjusted salaries (removing current comp arbitrage), the result is captured almost entirely by cumulative losses from promotion lag E(Delta t) described above. But these only capture part of the picture since new promotions lead to new challenges and might help capability growth compound even faster. There is richer behavior that emerges when we allow capability C(t) to evolve as a differential equation and reset (in a way) with each promotion. Then, promotions not only result in raises but new growth rates lambda, lambda phi:
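The update rule itself is not reproduced here, so the following is a minimal sketch of the compounding idea under assumptions of my own: capability follows dC/dt = lambda * C, each promotion multiplies the growth rate by phi, and the next threshold is the previous one times a hypothetical `theta_step`. Assessment noise is left out to keep the mechanism visible, and the values phi = 1.2 and theta_step = 1.5 are illustrative only.

```python
def simulate(regime, lam=0.2, phi=1.2, theta=1.5, theta_step=1.5,
             tig=3.6, alpha=2.0, years=10.0, dt=0.01):
    """Integrate dC/dt = lam * C with forward Euler and promote under TIG or PWR.

    Returns the final capability and the number of promotions.
    """
    c = 1.0
    promotions = 0
    steps_in_grade = 0
    tig_steps = round(tig / dt)                     # steps required in grade under TIG
    steps_per_assessment = round(1.0 / (alpha * dt))  # PWR assesses alpha times per year
    n_steps = round(years / dt)

    for step in range(1, n_steps + 1):
        c += lam * c * dt  # forward Euler step
        steps_in_grade += 1

        if regime == "TIG":
            eligible = steps_in_grade >= tig_steps
        else:  # "PWR"
            eligible = step % steps_per_assessment == 0

        if eligible and c >= theta:
            promotions += 1
            lam *= phi           # new role, faster growth (lambda -> lambda * phi)
            theta *= theta_step  # the next grade has a higher bar
            steps_in_grade = 0

    return c, promotions


tig_c, tig_promos = simulate("TIG")
pwr_c, pwr_promos = simulate("PWR")
print(f"TIG: capability {tig_c:.1f} after {tig_promos} promotions")
print(f"PWR: capability {pwr_c:.1f} after {pwr_promos} promotions")
```

The point of the sketch is the feed-forward: every year spent waiting under TIG is a year spent growing at the old rate, so the gap widens with each missed promotion rather than staying constant.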

A “missed” promotion or utility gap is a lost opportunity to increase worth but also continue the capability compounding schedule that benefits the employee (comp) and the company (capability). This feed-forward mechanism will tie promotion cadence to capability acceleration.

As an aside, when I started this I was thinking that current (and cumulative) comp would be the objective measure. But then I started digging into how that Market function would work and realized it’s so tied to C(t) that enough information is already encoded, making Market/comp redundant. What we end up missing is the cumulative comp comparison between promotion regimes, but again we could always integrate C(t) and map that to salary if desired.

When I started plotting the above traces to get intuition for the different family/regime behavior, I was initially surprised because the TIG and PWR were largely similar. I forgot that, under conditions for normal individuals, PWR does not lead to drastically different results since their capabilities do not necessarily grow super fast: lambda C(t) versus theta. But I’m more interested in high-talent or high-capability engineers and how TIG prevents them from realizing their potential (for themselves and the company). Therefore, for the model I explicitly sampled lambda ~ U(0.15, 0.25) and for TIG T ~ N(3.6, 0.4). This ensures that individuals cross theta long before T, so that TIG withholds a promotion for longer than PWR does (where alpha ~ N(2, 0.5) assessment events per year, so PWR individuals have 7-8 opportunities for promotion in the same T window).
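The "7-8 opportunities" figure can be checked with a quick sampling sketch. This assumes the second parameter of each normal distribution is a standard deviation (the post does not say whether it is a standard deviation or a variance), and the lower clips on T and alpha are my own guard against non-physical draws.

```python
import random


def sample_individual(rng):
    """Draw one high-talent individual's parameters from the stated distributions."""
    lam = rng.uniform(0.15, 0.25)            # capability growth rate
    tig = max(rng.gauss(3.6, 0.4), 0.5)      # TIG requirement in years (clip is an assumption)
    alpha = max(rng.gauss(2.0, 0.5), 0.25)   # assessments per year (clip is an assumption)
    return lam, tig, alpha


rng = random.Random(42)
population = [sample_individual(rng) for _ in range(10_000)]

# Number of PWR assessment events that fit inside one TIG window, on average.
mean_opportunities = sum(tig * alpha for _, tig, alpha in population) / len(population)
mean_lam = sum(lam for lam, _, _ in population) / len(population)

print(f"mean PWR opportunities per TIG window: {mean_opportunities:.1f}")
print(f"mean lambda: {mean_lam:.3f}")
```

With the means alone, 3.6 years times 2 assessments per year gives 7.2, which is where the 7-8 range comes from.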

I’m cooking the books a little here, but the reasons why are important. I’m mostly (only) concerned with high talent individuals whose lambda (capability growth rate) is above average. We have to assume that T is calibrated to fit in the bell curve of most individuals — that’s giving big companies too much credit, but it’s the only reasonable assumption for the model.

In the capability (top) and promotion event (bottom) traces above, we see that the “job hop” regime clearly outperforms the others, and then PWR robustly outperforms TIG. One caveat here is that I’m keeping the job hop largely as an indicator of the market. If you could optimally job hop, assuming each new role provides new lambda and theta parameters that allow you to accelerate growth, you would be optimizing for capability and compensation. In reality, this never comes to fruition because it hits a ceiling: at a certain grade level the expected capability is one that requires more time to achieve, including institutional learning and understanding the limitations of fields and industries. (The other reason it doesn’t work is that it signals bad habits, namely self-maximization, to hiring managers.)

When we sum the AUC for all traces, we find the TIG penalty is even greater than the traces indicate. Mean TIG capability AUC is 551 while PWR is 1679 (>3x). There are some TIG outliers that are near the PWR mean, but that ignores the rest of the traces. For interest, job hop mean AUC is 4711, roughly 8.5x TIG.
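For reference, the AUC of a capability trace is just its integral over time. The sketch below uses the trapezoid rule (my assumption; the post does not say how the AUC was computed) and then works out the ratios between the mean AUC values quoted above.

```python
def trapezoid_auc(times, values):
    """Area under a sampled curve using the trapezoid rule."""
    return sum((t1 - t0) * (v0 + v1) / 2.0
               for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]))


# Sanity check on a curve with a known area: C(t) = t on [0, 2] has area 2.
times = [i * 0.5 for i in range(5)]
check = trapezoid_auc(times, times)

# Mean capability AUC per regime, as quoted in the post.
mean_auc = {"TIG": 551, "PWR": 1679, "job hop": 4711}
ratios = {regime: auc / mean_auc["TIG"] for regime, auc in mean_auc.items()}

for regime, ratio in ratios.items():
    print(f"{regime}: {ratio:.2f}x TIG")
```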

I think one of the main takeaways here for managers of talented technical folks is to ensure they have frequent opportunities to grow and show their capability, something to bear in mind especially if you work at a TIG company. For gifted engineers, be even more aware of this, and be proactive about your alpha and lambda, and your self-awareness.
