Do we still need recruitment agencies?

Article originally published on dev.to

As a candidate looking for jobs, I’ve never cooperated with any recruitment agencies. But as a senior developer responsible for tech interviews, I was forced to work with some HR companies and I had some really weird situations because of that. Sometimes it’s annoying, sometimes it just makes me laugh. Anyway, due to my bad experiences it’s hard for me to find any reason to pay commission to an HR/talent agency. I’ve always disliked “men-in-the-middle”. Yet, many companies seek talents through agencies in addition to their own headhunting struggles.

Let me briefly explain a standard recruitment process we developed in our company. We didn’t have an HR department, so the whole process was led by me and CTO. First, I prepared a job offer which would reflect our current needs. Then, the CTO posted that offer on various platforms. We received different resumes and reviewed them. If you did not have a meaningful GitHub profile, we usually asked to do our simple recruitment task: write a PHP shopping cart implementation for existing unit tests; kind of TDD, most people solved it under 4 hours. We also sent some technical questions to see if we don’t waste your (and our) time meeting you at the office. Especially if you lived several hundred miles away. If we liked your answers and the solution to our task, we would invite you to a personal meeting (around 1.5 hour) consisting of a soft interview and tech interview. Finally, you could receive an offer from us… and turn it down if you did not like it.

Simple, isn’t it? Just two parties making business with each other. We are the client and we buy services that you provide. We negotiate the deal and if everyone’s happy, just tell us when you can start. The description I mentioned is generic; we tried to approach every candidate individually according to the provided materials, proven experience and a lot of other factors. You didn’t have to solve our coding task if you found another way to show off. Fun fact: we’ve never cared about your formal education.

Unfortunately the resumes we received from job board users were not enough and because the company did not want to invest its money in greater headhunting endeavors like going to conferences or recruiting a full-time HR guy, well… it decided to use some help from external HR agencies. We would present our needs to the agency and it was supposed to find right people for the job. We received a written recommendation for every candidate. After successfully hiring an employee, the company paid commission to the agency.

How recruiting with agencies went wrong

I wouldn’t moan if those companies did their job well, but what I experienced instead was:

  1. The recommendations were generic and useless. Every candidate was described as a positive, pro-active team player who aims for personal development. Blah, blah, blah. If you stripped that BS, you would find out that you’re dealing with a mediocre coder with a boring portfolio. Both the company and the candidate wasted time talking to the agency because one way or another, we had to figure out most things for ourselves. The most interesting facts were those off the record.
  2. One agency used to save these recommendations as DOCX files. I work on Ubuntu and LibreOffice did not properly render those files, throwing some contents away from the page. I had to switch to Windows, launch Microsoft Word, prepare myself a PDF file and switch back. Eventually, the HR guys learned how to create PDF themselves. What a relief.
  3. The same agency stripped my recruitment task. It originally was a GitHub repo which consisted of some important files in the root directory (composer.json, docker-compose.yml, phpunit.xml.dist and – most important – README.md!) and a tests directory. You had to write a simple implementation which passed all of my tests. I surprisingly found out that all candidates sent by that agency rewrote composer.json on their own. I asked them: “Why would you do that? The whole autoloader was defined there!.” It turned out that the agency sent the candidates only the tests directory. They did not send the full repo, of course – the candidate could find out what company made that task. The target company’s name is top secret in the beginning – and I recklessly used it as a vendor part of PSR-4 namespaces.
  4. The situation with missing composer.json involved also one candidate which was so tired of endless meetings with an agency that – after learning what company is he applying to – he declined to cooperate with the agency and sent his resume directly to us. I didn’t know that story and I was surprised to see that he already solved our task.
  5. Speaking of endless meetings – that’s the thing I always hear if someone decides to share his or her experience with an HR agency. When you see a fancy job ad like “For our client, a market leader…” and you send your resume, you’re invited to an entry interview in the agency HQ. Then you receive our task to do at home. Then they invite you to another meeting… but not with us! If that meeting succeeds, you eventually make your way to our facility. You go there and probably hear the same questions you’ve already answered in the previous meetings, because the incompetent agency did not properly sum up your answers and we don’t have enough data about you.
  6. Sometimes we don’t receive a full resume, but only a solved task (kind of a blind recruitment). Sometimes we receive a resume with a surname washed out. Sometimes we receive only the recommendation and not the original resume.
  7. Sometimes a candidate is forced to visit the target company together with a guy from HR, who ensures that we don’t make a deal behind their back and we don’t abuse you. To me it’s like coming to an interview with your dad! He’s a man-in-the-middle. It’s not a deal between an employee and an employer. Fortunately, your “dad” stays only for a “soft interview” and he walks away during the tech interview (which I lead). He goes for a coffee and waits there until we’re done. So basically he gets paid for drinking coffee in our office.
  8. It’s funny to see how an extravert, upbeat HR guy brings a terrified candidate to the meeting. At the first glance I would hire the HR guy, not a shy, scared dev.
  9. My coworker received an offer for my position from an agency when I decided to leave the company.
  10. My boss received an offer from an agency to hire a colleague who was just about to leave the company.
  11. When you get hired and work for a while, the agency calls you from time to time to ask how is it going. That’s what my coworkers told me. As an employee, I would find this annoying.

I have no way to convince my soon-to-be-ex employer to stop wasting money with HR agencies (or invest in the right one). But I still wonder why good devs in Poland respond to the job ads posted by agencies where you don’t have a target company’s name disclosed. The salary range is not specified either. If I invest my time talking to the HR guy, I would like to know in advance who am I going to work for and what salary I can expect!

Carving my own path

Like I said in the beginning, I have never looked for a job through an agency. I have some basic googling skills, but let me show you a brief story of my career which provides some more tips:

  1. I met my first employer during high school. It was a local news company. My friend had already worked for them and he asked if I could make some photos at local events. Still being a student, I started my humble photography career in my spare time. After my high school exams (Polish matura) and before going to university, I got a permanent job offer. They discovered I can code PHP for food, so they hired me to work on their new website during the day AND make photos in the evenings. Weird setup, but this job gave me a lot of life experience.
  2. After five years, I met my future girlfriend. I decided to relocate to Gdańsk which is ~300 km away from my hometown. I started browsing pracuj.pl, a Polish job board and soon I found an offer from an education company. I sent my resume, did a recruitment task and after three weeks I got an offer! I’ve been working there for over four years. During that time, my salary has doubled and I went from a mid developer to a team leader.
  3. Me and my girlfriend decided to change our lifestyles and travel more. I needed a full remote job. I remember meeting a nice software house at a PHPers conference in Poznań. I sent my resume. Unfortunately they just stopped their recruitment process at a time. But after a month I received an e-mail inviting me to a new process, which I passed successfully. I had three interviews via phone and Skype: entry, tech and soft skills. I really liked all those interviews. One of my interviewers said he saw my PHPers presentation about database optimization and he already knew my name. I got the offer with a salary which allows me to live decently, buy more music equipment (oh yeah!) and even save money for retirement.

As you can see, my ~10 years career as a developer did not include any deals with any recruitment agencies. It is important to say that I’m not an easy-going, upbeat and extravert person. I’m not good at getting my foot in the door, but at least I’ve learned how to create my resume properly, find a possible employer and make him interested in my offer. I’ve spent a lot of time browsing Polish nofluffjobs board where all the offers are plain and simple, with salary specified upfront.

What’s more to do? I guess I should visit more conferences and give more talks, so that people will know my name. I should write more blog posts and possibly contribute to Open Source. That way I’m going to develop my personal brand and hopefully do my business stuff without the HR agencies.

If you work in an HR/talent agency, please improve your skills. I know that IT headhunting is very hard and it might be really frustrating to abuse LinkedIn for the whole day just to receive a bunch of rude replies (or no replies at all). But if you want to do a good job as a headhunter, you need to understand how candidates and companies behave and what they really need. We’re all here to make business happen, right?

Is a managing position good for me?

When you’re an engineer, sometimes you might get an offer to take over management duties. It might be for example leading smaller projects, being a scrum master or leading a whole team of developers. When to consider such options and how to get prepared for new challenges?

I assisted my team leader for two years in his management duties. Then I became a leader myself and rised employment to 8 developers. I learned a lot about working with people. After a year I decided to change my job and have some rest. Having some perspective now, I decided to share my experience.

Do I have to be a manager to get a raise?

It depends on the company policy. Management is not for everyone – you have to feel it and like it. Some companies know it and value both talented engineers and managers the same. Other companies give more value to employees who are eager to step out of line, take the wheel, lead projects, watch the budget and the deadlines. It’s good to make this clear during recruitment talk.

What do I get?

It’s nice to have a fancy job title in our e-mail footer, but it should not be a main reason to take a management role. I can see some other pros:

  • real influence on the way the team works and the projects are managed
  • soft skills development: working with people, knowing their personalities, d1eveloping non-verbal communication, resolving conflicts, keeping good atmosphere
  • satisfaction from mentoring and supporting others; you have a big influence on their careers
  • you will get credit for your team’s successes from the stakeholders

Do I fit into this role?

Think about what kind of boss would you like to have. And then become such a boss.

Programming is a team work and it requires trusting people, being a team player and taking responsibility. This is why giving commands does not work well here. We assume that a development team consists of intelligent, mature and open people – and we have to treat them like this.

As an employee, I’d like to have a boss who:

  • has a sense of humor 🙂
  • is open to discussion, does not force his will – but can have a deciding vote when needed,
  • respects other opinions, fosters having a dialog,
  • eases conflicts instead of making them worse,
  • has a positive approach, but also keeps both feet on the ground,
  • evaluates my work in an honest way,
  • clearly communicates his intentions and concerns,
  • shares experience and motivates others to move forward,
  • delegates tasks, does not leave the best for himself (herself),
  • doesn’t force anybody to work overtime,
  • is not a control freak.

Start with small steps! You can always take the initiative in your team even if you don’t have an official manager’s title. Help people in their everyday struggles and propose solutions. Actively participate in retrospectives and other meetings. If you see that your coworkers really appreciate your support, invite to discussions, share ideas and respect your opinion – it’s a sign you’re a good material for a leader.

How can I keep my development work while being a manager?

If you take a managing job, you’re going to lose some time spent previously on developing things and spend it on managing different relations with a lot of people. Your workday still has 8 hours (please resist the urge to work overtime!) and you have to allocate your effort wisely.

It might get hard for you to find time to peacefully finish your code, write good tests and refactor. You’ll find yourself delegating your favorite coding tasks to others (and that’s a valuable management skill!). You’ll notice that your teammates learn new technologies faster. After a couple of years you might feel staying behind the competition because management duties take most of your time. If you decide to change your job and switch back to coding full-time, you might have trouble following the latest trends.

Me and my fellow managers always had a favorite part of the day, like early morning or late afternoon, when we could do some coding in silence, completely undisturbed. However, turning the lights on (or off) every day in the office is not a good solution. It’s crucial to give the team a clear message when you’re available for them and when you should have some peace. You can always say “I’m sorry, I’m busy right now, give me a minute and I’ll get back to you”. Also, try to educate people, so they can find a solution for themselves – don’t do their work.

Having more duties requires refining your self-organization. Take care of your desk, mailbox, Slack conversations – keep these places clean. Eliminate distracting factors, configure notifications from different apps to avoid notification fatigue. Know your daily rhythm – your best and worst working hours.

Warning signs

Sometimes even though you’re doing your best, things aren’t working the way you expect. Maybe there is something you cannot influence? Sometimes it’s better to turn down a bad offer than waste your time and stress. Be careful if:

  • many people quit their jobs and you received an offer only because no one else wants to take it,
  • the company is trying to put too much tasks and responsibility on your plate,
  • the atmosphere in your team and in the company is bad and no one can see any solution,
  • the company always cuts the corners when you see an urgent need of investments, hiring more people or giving them raises,
  • you feel like you won’t get any support from other, more experienced managers and you’ll have to figure everything out yourself.

Some companies are just bad. Don’t let them overload you with work that’s not yours! If you see your employer cutting corners all the time, you’ll soon find yourself in charge of managing projects, meeting deadlines, talking to clients, recruiting people, mentoring, reviewing code, coding itself… You can’t do all these things simultaneously.

Keep your head up!

Managing an 8-person team was not easy for me. It required a lot of energy, creativity, perseverance and optimism. I made a lot of mistakes. I was tired and I decided to go back to coding full-time and make up my technical abilities. However, leading a team was a very valuable experience from me and maybe some day I will consider a similar offer again – if I get one.

If you’re not a complete outcast and a loner, maybe it’s worth trying to have some management duties. Even without being a formal superior, you can influence your team by being proactive. You should learn how to deal with people, manage your time and take responsibility for the business that hires you. Don’t avoid new challenges!

Take a look at this wonderful talk about leadership

When your SQL database is missing foreign keys

…then sooner or later, you’re going to have a bad time. Bugs in your app or users’ recklessness will cause your database to be inconsistent.

An example from my job: a system had users and users_categories tables. While registering their accounts, new users entered not only an e-mail address and a password, but also selected categories, like teacher, student, parent. The data were immediately inserted into a MySQL database. But the account had to be activated via e-mail. A script was executed every day to purge inactive accounts. It wiped records only from the users table, not users_categories. There were no foreign keys to block that behavior.

Every company has some erroneous legacy code here and there. I saw databases reaching 60-70 gigabytes in size, with hundreds of millions of records and not having foreign key constraints because someone… was afraid of them. A long time ago, one database in the company had ON DELETE CASCADE foreign keys, which means deleting one record caused cascaded deletions of related records. My colleague destroyed almost half of a test database, so he decided to remove foreign key constraints. That was a classic misunderstanding of the technology that we were using.

How a foreign key constraint works?

In a relational database, foreign keys are meant to secure the references between entities. For example, in a users_categories we have two relations: with users and categories tables. No row in the users_categories table cannot reference a non-existent user or category.

The only exception is if a row contains NULL values. This is a way to create optional relations.

Creating a foreign key usually looks like that (based on InnoDB engine in MySQL):

ALTER TABLE users_categories
ADD CONSTRAINT fk_users_categories_user_id
FOREIGN KEY (user_id) REFERENCES users (id)
ON DELETE RESTRICT

We create a constraint with a specified name (we don’t have to, but it’s good to have a name that we can later use during reverting a migration). We specify a column which should be restricted and a destination table and column which we want to refer to. In the end, we decide what happens if we try to delete related data. A default behavior is to restrict such query and issue an error, for example if we try to remove a user without removing his (her) categories list. We can also decide to have a cascade deletion or set NULLs, but I always choose the safest option which is to RESTRICT.

Foreign keys require that:

  • both columns (the one we create a constraint for and the one we refer to) must have exactly the same type; a SMALLINT column cannot refer to INT,
  • there must be an index in the source table for the column; it can be a multi-column index starting with that column; if no index is matching, a new one will be created
  • a constraint must have a unique name in the schema scope

Introducing foreign keys in an inconsistent database

If you have some existing data without foreign keys, you need to clean it up first. That’s a challenge because sooner or later, a database without foreign keys will be a mess. MySQL will not allow us to create a foreign key on wrong data – unless we issue a SET FOREIGN_KEY_CHECKS=0 query before adding constraints (that’s what mysqldump does by default). However, we would like to have proper data not only until now, but also to fix what we already have. We need to somehow untangle that existing spaghetti.

I decided to analyze existing data to know:

  • how many records with wrong references do I have
  • what strategy will be the best to fix these particular records
  • are there differences in column types

On every table I executed a query showing all records with wrong references:

SELECT * FROM users_categories WHERE NOT EXISTS (
SELECT * FROM users WHERE user_id = users.id
)

In the example above, if the user_id column would allow NULL values, we should add a following condition: user_id IS NOT NULL.

I noticed a funny trend between developers not using foreign keys: they define optional references as, for example, INT NOT NULL DEFAULT 0. This is not correct because usually, in referenced tables with AUTO_INCREMENT or SERIAL primary keys, there are no records with id = 0. So introducing a foreign key in this case will not work. I had to modify the table schema and then change all 0s to NULLs. Let’s say that a user might have an optional reference to a city:

ALTER TABLE users MODIFY city_id INT DEFAULT NULL;

UPDATE users SET city_id = NULL WHERE NOT EXISTS
(SELECT * FROM cities WHERE city_id = cities.id);

Here we can see a strategy of setting NULL every time we cannot find a destination entity. We don’t want to remove a user’s record just because it refers to a non-existent city. NULL means a value is unknown, uncertain (it’s not the same as an empty value or 0). If we have an incorrect city_id value, we cannot specify which exact city it refers to – so we set it to NULL.

Other strategy can be taken for many-to-many join tables. In the users_categories table I mentioned, if we have invalid user_id, category_id or even both – this means that the whole record can be removed:

DELETE uc FROM users_categories uc WHERE NOT EXISTS (
SELECT * FROM users u WHERE u.id = uc.user_id
) OR NOT EXISTS (
SELECT * FROM categories c WHERE c.id = uc.category_id
)

When you pick a certain data cleanup strategy, you need to know the business well. Maybe the wrong data is used in some reporting systems and if you suddenly fix the stats, it will confuse people because they relied on these data for a long time. They might perceive your ingenious fix… as a mistake 🙂

Fixing complex cases

Sometimes, relations between entities form a long chain: for example, a user puts and order which consists of several items, and every order item refers to a product, and every product… has been added by some admin user. At first you should check the most generic entities (like products) and then dig deeper into orders and order items. Take a look at the following steps to see which strategy I pick in different scenarios:

  1. Check if there are products added by non-existent users. An information about a removed user is not necessary for the system to work. A wrong user_id can be set to NULL.
  2. Check if there are orders with wrong user_id. An orders history is essential for the legal, tax and reporting purposes. We cannot remove any orders, so we just set user_id = NULL if a user does not exist (maybe he was not logged in?).
  3. Check if there are any order items related to non-existent orders or products. Items contain important invoicing data like net and gross price. But if a report requires a proper reference from orders_items to orders table, and we have queries like SELECT … FROM orders JOIN orders_items …, then a JOIN clause eliminates wrong records anyway. So we can remove items that refer to wrong orders. The same situation applies to product_id. We won’t prepare sale reports for products which do not exist.

Of course before executing such dangerous queries we need to have a backup. And while we analyze the details of the system we’re trying to clean up, we should consult as many people who have a business knowledge as possible.

I know it would be nice to have an automatic script to clean up all the data before adding constraints. But there is no silver bullet because every case is different. You need to use your creativity and intuition! It’s not easy, but in the end, satisfaction will be great.

Some interesting, further, external read:

6 Things That Can Ruin Corporate Trainings

How to gloriously waste three days of a 10-person development team? Send them to a wrong training led by a wrong person. As a team leader I made this mistake twice. There will be no third time.

One training was about microservices and the second one about refactoring and design patterns. We observed some really annoying anti-patterns of a technical training:

  1. Lack of preparation for practical exercises. You could say, “slides are cheap – show me the code”. And the code was bad. Our coaches had problems with IDE and dependencies. Their development environment differed from ours. We all strugged with proper configuration, wasting hours sitting on StackOverflow. It destroyed students’ motivation and coaches’ authority. We could avoid these problems if we earlier agreed to use the same environment and if the coach would check his scripts the day before.

    The exercises were chaotic also because both coaches did not really have an idea what exactly they want to do. First we created a class, then removed it and created another one without a clear purpose. Then we set up a whole different project from scratch. We wrote a lot of “example” code which eventually was moved to trash anyway. It could not serve as any aid or template during our work.

    I think we should have had a clear plan for the training and work on one codebase during all three days, slowly improving it. A clear goal should have been set.

  2. A coach does not allow discussion. This is bad because you can develop stuff in many different ways. A coach should not talk from a “guru” position. Maybe still he could learn something from his (her) students? A coach should not cling on to the agenda if he sees that students have other ideas and needs. He should not treat students’ feedback as an attempt to underestimate his authority.

  3. A coach demands 100% focus all the time, from everyone. I saw a coach who loudly reprimanded a girl who was handling company texts during training. She politely explained that it was an emergency. You can always have an emergency if you leave your company for a couple of days. The second thing is that every human has different hours in which he or she has most attention.

    As a coach, you should recognise symptoms of fatigue or lack of interest. Maybe your students need a coffee break? Maybe they disagree with you, but you speak so loud and fast they are afraid to interrupt you? Ask the group if everyone is okay and if anyone needs a break.

  4. A training company hires a coach who uses certain technology only from time to time. The company promised us they would craft a special training for our custom needs. But we had no idea that they would force a JavaScript coach who had “some previous” experience in PHP to conduct our PHP design patterns training. It turned out that we were the ones who taught our coach, not the other way around.

  5. A coach does not admit mistakes. If students tell you that you are wasting time, trying to configure some tool by copying and pasting from StackOverflow for two hours – maybe they are right? Maybe you should let go? You didn’t prepare, you’re not handling it – it’s okay, we’re humans. Just apologize and admit you made a mistake, try to move on, adapt.

  6. A coach makes an unnecessary barrier between him (her) and the students. He is too formal. Come on, developers are cool guys, don’t call them “mister” and “miss”! Let’s treat a training like a place to share knowledge and have a discussion like colleagues do, not in a “master-student” relation. So as a coach, don’t run away during a break. Have a chat with students, be available, share some funny stories.

    However, don’t try to be too cool, too “homie”. Be yourself, be authentic. You won’t build trust and authority by trying to be someone else. Also, don’t try cheap motivational talks, be specific.

My last advice for anyone who tries to train people: show yourself at conferences. Build your own brand, practise your talks, prepare for the hardest questions from the audience. Build your authority by showing your skills and openness.

Why we adopted a coding standard and how we enforce it

Everyone might have their own code formatting preferences. Problems arise when a team consisting of a couple of individuals work on a common code base and every developer has different preferences. It’s hard to maintain pull requests in this situation.

The most ridiculous and unproductive quarrels in my career were about whether we should use tabs or spaces; should we place a curly bracket in the same line or another; should we leave a blank line at the end of file; etc. It’s mostly a matter of a personal preference, however both sides had some interesting arguments.

Once I told my team that we should adopt common code formatting rules for PHP. It was clear to us that we had a mess in our repos. I asked if we wanted to waste our time endlessly discussing every formatting detail. This way we adopted PSR-2 standard.

Of course we couldn’t just reformat all our repos at once. Every one of seven PHP developers already worked on their branches. We had new things merged to develop or master a couple of times per day. This is why we were reformatting our code piece by piece. We were choosing the best moments to do a global reformat in PhpStorm. It was easiest when we knew that only one developer at a time was working on a certain repo or module.

It is important to place a code reformatting operation in a separate commit. You shouldn’t mix formatting changes with functional changes because it makes browsing pull requests difficult. Some Git GUIs can hide whitespace changes but they cannot hide syntax changes like array() to [].

Automatic code checks and analysis

To ensure that our coding standards are met, we set up PHP Code Sniffer in some repos. It verifies that the code complies to the PSR-2 coding standard. After every push to the central repository we get a message about formatting errors. Our testing and analysis tools are automatically launched by the CI (Continuous Integration) tool, for example GitLab, Jenkins, Travis, CircleCI, Bamboo.

You should notice that phpcs by default enforces that no classes remain in the global namespace. It’s good because, believe me, code can grow really fast and it becomes more and more cumbersome to understand the sophisticated code base without proper namespace structure – best done according to PSR-4.

We can try some other static code analysis tools as well: PHPStan, PHPMetrics etc. They do very good job of finding basic and most common code flaws. During the pull requests we can focus on checking advanced business logic because we know that the code formatting and basic code smells were already checked automatically. It’s important especially for dynamically-typed, interpreted languages like PHP where there is no compilation process. Also, PHPStan helps us prepare the code for new PHP versions.

Learn how to count money… or you will lose it

Does anyone know where does the following difference come from?

$ php -r "var_dump((int) (4.20 * 100));"
int(420)

$ php -r "var_dump((int) (4.10 * 100));"
int(409)

It sounds weird when I get a ticket like this: When I set the price to $4.20 everything’s fine. But I cannot set the price to $4.10 because the system shows $4.09. I did some research and discovered that a user entered the price in dollars and then we converted it to cents to store in the database as an integer. That’s where the mistake was made.

Those problems arise from the way that CPU stores floating point numbers. PHP does not have a built-in decimal type, so it uses IEEE-754 standard. In this standard, numbers are stored in a binary system; both the mantissa and the exponent.

You can play with converting different numbers here: IEEE-754 Floating Point Converter. You can see that the number 4.10 in fact is stored as 4.099999904632568359375. When you multiply it by 100, you get around 409.99999. Casting to (int) causes dropping the fractional part (not rounding), so we get the number 409.

How to operate with currencies in PHP? The best way is to use decimal types or dedicated currency classesMoney is a classic example of a value object as a Domain-Driven Design building block.

If I had a dime for every time I’ve seen someone use FLOAT to store currency, I’d have $999.997634 – Bill Karwin

How unit tests help changing existing code

You should have some tests. Why? Because developers are afraid to make changes in a code they don’t understand. This is a common problem not only with fresh employees, but with everyone who stands in front of a huge, complex, legacy system. They are afraid to add a new if not to break the other ones. They are afraid to erase code that seems obsolete.

Instead of refactoring, people tend to add classes and methods with suffixes like _new_2 and so on. I saw the same fear of changes in QA and op teams. When I try to refactor things, I often hear: Oh, maybe not now in case if something breaks, maybe later…

Overcoming the fear of changing an unknown code

I had a problem changing a piece of complex code adding symbols to online transactions. The symbols depended on product types, legal issues, invoices etc. The sales dep coined some weird terms which developers didn’t understand, and those terms were very important for them. I was told to add yet another weird symbol on top of that. The original code looked like this:

public function getDescription()
{
if (\in_array($this->getItemsType(), [self::ORDER_MATERIAL, self::ORDER_MIXED], true)) {
$type = 'DW';
} elseif (self::ORDER_COURSE === $this->getItemsType()) {
$type = 'SzO';
} else {
$type = $this->hasInvoice() ? 'M' : 'P';
}

return sprintf('Order #%u / %s', $this->getId(), $type);
}

After analyzing the original code, I wrote a unit test for all cases:

use PHPUnit\Framework\TestCase;
use Piotr\Blog\Entity\Order;

final class OrderTest extends TestCase
{
/**
* Test data will be provided by ordersProvider().
* @dataProvider ordersProvider
*/

public function testDescription(int $orderId, int $itemsType, bool $hasInvoice, string $expected)
{
$order = new Order($orderId);
$order->setItemsType($itemsType)->setHasInvoice($hasInvoice);
$this->assertEquals($expected, $order->getDescription());
}

public function ordersProvider()
{
return [
[123, Order::ORDER_MULTIMEDIA, false, 'Order #123 / P'],
[123, Order::ORDER_MATERIAL, false, 'Order #123 / DW'],
[123, Order::ORDER_MIXED, false, 'Order #123 / DW'],
[123, Order::ORDER_COURSE, false, 'Order #123 / SzO'],
[123, Order::ORDER_MULTIMEDIA, true, 'Order #123 / M'],
[123, Order::ORDER_MATERIAL, true, 'Order #123 / DW'],
[123, Order::ORDER_MIXED, true, 'Order #123 / DW'],
[123, Order::ORDER_COURSE, true, 'Order #123 / SzO'],
];
}
}

I checked code coverage for this method, by curiosity. It was 100% which made me confident that all lines are executed during the test (however, it might not ensure that all cases are checked). Now I was ready to add another condition to the getDescription() method:

public function getDescription()
{
if (self::ORDER_VIDEO === $this->getItemsType) {
$type = $this->hasInvoice() ? 'W' : 'WP';
}
/* ... */
}

I ran my unit test and received no errors. Success – I didn’t break anything! Since I added some extra lines to my class, the code coverage dropped. I needed to add new test cases:

public function ordersProvider()
{
return [
/* ... */
[123, Order::ORDER_VIDEO, false, 'Order #123 / WP'],
[123, Order::ORDER_VIDEO, true, 'Order #123 / W'],
];
}

Test is passing and my code coverage is 100% again. Now I know that another developer who takes this code over will have a trustful test checking all conditions.

Make sure your business works!

The above example is easy. Unfortunately, in every day job writing unit tests is hard if we need to deal with a poor system architecture, tight coupling and… managers refusing the team to spend extra time writing tests. Moreover, unit tests alone might not be sufficient – you would want integration and UI tests also, and it takes time. But, as Robert C. Martin ironically pointed out once:

Can you imagine telling your users: You know, I don’t write tests. I just write the code. Sometimes it even works. And I ship it to you, and if there are bugs, you’re going to tell me, aren’t you?

Your company might not need a 100% code coverage – often it’s very difficult and even unprofitable (some people claim it’s dangerous). From the business perspective, we should write tests that protect the key business issues first – and that’s usually the domain logic. We want to make sure that none of the business rules that we agreed upon will break after deploying new features. We want to make sure that users will still be able do place orders, and all the orders will be properly accounted.

However if you want to write tests for business rules, then those rules have to be clearly written in the code. This is often cumbersome – and I’m going to tell you about it, some day.

How legacy code is made – part 2

In the previous post, I told you how tight coupling causes systems to be hard to maintain and expand with new features. Today I’m going to show you how developers often take shortcuts which then add up to the technical debt.

I worked on a big e-commerce system for one of the leading Polish education publishers. When I joined the team, there were already many crazy workarounds in the system. I found one particular gem which was really funny and perfectly showcased the absurd of taking ridiculous shortcuts. Let me describe it.

There was a feature for wholesale customers to import a CSV file with all the products they wanted to order. The import script read the CSV line by line and placed every product into the order. Every line started with a product code consisting of several letters and numbers. The Polish alphabet, apart from standard Latin letters, contains 9 letters with diacritics (accents). They are not contained within the basic ASCII set, so at some point the team ran into character encoding issues.

This ticket was found in the issue tracker:

Issue title: Cannot import CSV file with diacritics

Issue description: The import function do not recognize and RŁ2 product codes. Under Windows, CSV files have a default CP-1250 encoding. The database uses UTF-8 encoding. For these two product codes, we need to add special conditions.

Notice that the last sentence of the issue description is a trap; it suggests creating a workaround, some special case in the code. One PHP developer fell into that trap and implemented a fix like this:

while (($csv = fgetcsv($fp, 256, $delimiter)) !== false) {
// for CP-1250 files we convert Ł letter to UTF-8
$csv[0] = str_replace("\xA3", "\xC5\x81", $csv[0]);
// ...
}

Now imagine that we have some product codes with another diacritic letters. What would an average developer do? Add another special case for converting that character? It is really tempting to do just that, then commit the change and forget about it.

This is the problem with workarounds: people easily follow them instead of solving the problem properly. Not everyone has the courage to modify the code they did not write. People think, “If it’s been around here for several months or years, it must be correct. Someone who did this must have known what he or she was doing.”

In this specific example we could simply use PHP’s iconv function to solve the encoding issue properly. So it’s hard to say that anyone has actually made a shortcut here; in fact, the original solution was a much longer walk than it should be.

Don’t rush! You won’t save time!

We all do bad things when working under pressure of deadlines. But often a hurry makes us actually work slower. Trying to take a shortcut and creating workarounds has its consequences which then are stacked to something we call technical debt. Think, research and ask before you code.

New developers who join the team do not know the project and its practices. They just follow examples in the existing code base. This can lead to funny and awkward situations. We should spend more time on leaving good examples for our future coworkers (and ourselves). If we can’t make a good code before the deadline, we should clearly mark our workarounds and schedule some refactoring as soon as possible.

How a legacy code is made

“The (…) problem with legacy assets is how difficult and risky they are to invoke. They were developed with a particular feature-set in mind, and are highly coupled to those features. This makes it hard to access what you want directly without going through layers designed for very different purposes. The data you want is nearly always intertwined with other data and functionality, so that it is hard to detach just the part you want without dragging along an endless tangle of interactions and associations.”

Eric Evans, “Getting started with DDD when surrounded by legacy systems“, 2013

I received an apparently easy task to do. A head of sales wanted to receive a list of all products sold in one of the departments. She expected a table with just three columns: product name, default rebate name and rebated price. Almost every product had some rebate attached to attract clients.

This task should take maybe 15 minutes. Or maybe I should not even do it at all. Salespeople should have a reporting feature for that. However, it turned out that a mechanism to fetch products with prices was strictly tangled with the way these items were presented in the shop. I couldn’t retrieve all the products at once because I was forced to do it page by page (maximum 50 products per page). The rebate system worked only for a currently logged in user and everyone could have different rebates based on the account settings. I could not easily simulate rebates on another accounts. Eventually, preparing a rather simple products summary took three hours of writing a very dirty code…

The shop was custom-made to meet strict specifications oriented towards an ordinary web browser. From an end user’s standpoint, a shop should have a paginated list of products, a shopping cart, rebates info etc. Just some standard e-shop features. No one thought about additional use cases that might show up some day – like salespeople asking for summaries and reports. No one thought that some day we would need to share data through an API endpoint. Deadlines were coming after us. All developers just wanted to get it done according to specs. A legacy, proprietary PHP framework did not make it any easier.

That way, we created lots of code which became “legacy” and inefficient just after half a year past deployment.

Separating code responsibilities into layers

To handle such different requirements, a multi-tier/multi-layer and hexagonal architectures were invented. Sometimes, simple CRUDs shown in framework’s documentation are not enough; complex applications need a clear separation between a business/domain logic, infrastructure and presentation. We create a network of loosely coupled modules which, thanks to interfaces and dependency injection, can be easily switched and replaced. This approach has a lot of pros:

  • Every layer can be, to some extent, developed independently. Because of abstraction, the business logic does not depend on a particular way of presentation, page rendering, data storage etc.
  • Every layer can have a separate set of automatic tests.
  • While editing the code of one layer, we are not distracted by the code of other layers.
  • The same data can be presented in multiple ways. We just attach a different presentation layer implementation (HTML, PDF, JSON API, CLI…).
  • Business rules are clear. We don’t need to look for them in strange places, like templates.

“Allow an application to equally be driven by users, programs, automated test or batch scripts, and to be developed and tested in isolation from its eventual run-time devices and databases.”

Alistair Cockburn, “Hexagonal architecture

In practice, using a multi-layered, hexagonal architecture with loose coupling needs a lot of experience. I discovered that it’s really hard to convince a team to talk about abstract models. People tend to ask about the details very early, like database, graphics – and there’s nothing wrong about it. People need to imagine the final effect, a real use case – it makes them easier to understand the project.

During project discussions, I suggest not to avoid speaking about implementation. However, we should strive for a transparent and fluent code. It’s worth negotiating some additional time with the client, so that we can create a good project that will save him maintenance costs in the future.