Why I didn’t end up being a database engineer

Once upon a time, looking for new career opportunities, I considered becoming a database engineer or architect. I was pretty successful in optimizing MySQL databases growing up to 65 GB. I liked it. However I noticed almost no one was hiring MySQL database engineers – it’s too simple. Real challenges await those familiar with Oracle and big finance systems, aviation etc. So, I bought a thick book about Oracle and… never used it.

The only profit I got from buying an Oracle guide was another book I was offered in package: “SQL Antipatterns: Avoiding the Pitfalls of Database Programming” by Bill Karwin. It provided me a lot of valuable tips which I still eagerly keep on sharing with my coworkers.

Is database the center of every app?

Anyway, I used to believe that a database is in the center of every application. Most of them process data, right? So I started all projects from conceiving a database model: relations, data types, queries, and so on. I spent a lot of time optimizing data structures.

My approach had several pros. First of all, it was intuitive for other developers because a good database model made them immediately understand the purpose of the system and navigate easily through the rising codebase. Secondly, a good model helped datasets grow significantly (like hundreds of millions of records) without performance loss. Moreover, with correct foreign keys and other constraints, data remained consistent.

I have a funny story about a team of developers which refrained from using foreign key constraints. They used ON DELETE CASCADE clause everywhere and they accidentally wiped away half of the test database. It was such a shock for them they removed all the foreign keys. Soon, data became terribly inconsistent and a lot of bugs surfaced in the apps depending on that database.

Carving a SQL giant

Optimizing databases was cool, but there was one bad practice I really disliked looking at some DBAs work. After I read about SOLID principles, Domain-Driven Design and other stuff, I started to hate SQL stored procedures, especially when they were too long.

I saw some MS SQL devs trying to handle all the business logic with SQL scripts. These systems already had many bugs, and my fellow devs just kept on adding more CASE…IF blocks after requests from the business people. After some time, it was nearly impossible to add or change anything without breaking stuff.

Such situations made me realize that you have to choose the right tools for the job, not the other way around. Of course data is a huge asset for every company, but behavior is important too. Is SQL a right tool to model complex business behaviors?

Getting to know the big picture

For me, data has never been the biggest challenge. It is always the sophisticated business logic that needs to be modelled very carefully. It evolves over time, and we have to make sure all developers (especially the new ones) dig it quickly and flawlessly.

I decided to improve my process modelling skills with OOP and FP, know SOLID, DDD and other widely adopted principles or design patterns. I decided to share business knowledge with all my teammates to make sure they understand the purpose of their work and nature of the surrounding business. Of course SQL knowledge is valuable too, but it’s not enough.

Is Portable Document Format obsolete?

I spent two years on a project which aimed to create highly sophisticated PDF reports for teachers. These automatically generated reports contained bar chars, pie charts, line charts, a lot of tables and everything else. It worked: end users were very happy about the new reports system. But still, something keeps bothering me.

Making PDFs can be tough. Do we still need them?

There are many ways you can generate PDF files. I started from a Java class which calculated every table cell dimensions, position of every text line etc. Then I used PHP libraries like FPDF and TCPDF. The latter had a very simple HTML parsing engine. Then I came across Zend_Pdf and kept using it until I stumbled upon wkhtmltopdf. Yeah, that was a blast! Someone put together a WebKit rendering engine and a PDF printer.

But even wkhtml was not prepared for all the challenges that my client had for me. Dynamic headers and footers. Sophisticated charts which I prepared as SVG files. Errors with dividing tables across multiple pages (with table headers of course). Lack of dynamic font size adjustment depending on the text amount. Trouble with applying custom fonts.

Chrome 59 offered printing to PDF via headless mode. Having recent Chrome rendering engine is great, but I lost some features available in wkhtmltopdf (like these crazy page headers and footers).

You can already see that generating proper PDF files might require a lot of tweaking and quirks. What if we don’t need PDFs at all? In the era of portable devices coming in all shapes and sizes, do we still have to limit our documents in an A4 or any other paper format?

Is PDF always a right choice for our project?

Let’s think about the business needs for a while. The system I mentioned was designed for teachers who got used to print everything on a daily basis. I’m not sure how it works in other countries, but Polish teachers print a lot of stuff. They download a PDF, then print and put on their desks to discuss with pupils and their parents. That’s their habit.

I worked on a similar reporting system designed specifically for parents. The initial business decision was to copy the same well-known PDF system and do some adjustments. I asked my client: hey, do we really need to spend a lot of time fine-tuning PDFs again? Do our end users need PDFs? They have different habits, they have smartphones and tablets. They are not going to print our reports. They want a modern, responsive web app.

That’s how I convinced my client to drop PDFs in the new system. I made development a lot faster and easier. We made a modern frontend layer using ReactJS and everyone was happy with the result.

Do we still need static PDF layouts for invoices, order confirmations, tickets, magazines, official documents, reports?

In Poland, we have electronic train tickets that can be downloaded on a smartphone. It is an ordinary PDF file with all the ticket details. But all we have to do is to show a QR code during ticket control. The same applies to concert tickets. Why use PDF if a simple HTML would be enough?

Further reading

Will PDF files become obsolete in 5-10 years?

There are some limits to the YAGNI principle

Have you heard of YAGNI? The acronym stands for “You Ain’t Gonna Need It“. The original meaning, as Extreme Programming guru Ron Jeffries said, was to “implement things when you actually need them, never when you just foresee that you need them”. But how did this principle evolve over time?

Since the late 90’s we have witnessed a lot of frameworks and tools emerge. We love new shiny tools, don’t we? Sometimes we think that a new framework will solve all our problems. That’s what the “Hype Driven Development” article described.

So then people started to talk: okay, maybe we should care more about the business, not just our fun? Maybe we don’t need so many tools and libraries everyone else is hyped about? Maybe we should not do a big rewrite of our systems every six months after another tech conference?

Don’t do everything by yourself

If using the latest hyped tool for every task is one extreme, then doing everything by yourself is another one. The latter option was proposed by some guy conducting a training on microservices (another hype term by the way) for my company.

I was in a team using PHP and we wanted to know something more about developing proper microservices architecture. We did a lot of analysis of our current system and decided this might be a chance. What we heard during the training was:

  • Our coach would never use Symfony for a microservice because “it’s huge and will cause performance issues”. That was a couple of months before Symfony 4 was released, but the same guy argued that “computing power is cheap and we shouldn’t care”. Umm…
  • Our coach would never use Doctrine for a microservice because “it’s huge and will cause performance issues”.

Instead of showing us some alternatives, the coach asked us to write an example application in vanilla PHP.

By the time the training was conducted, I had already spent years working on home-brew “frameworks”, made by people who did not believe in the popular systems and desperately wanted to do things their own way. Such people tend to leave the company a couple of years later, overwhelmed by all the issues caused by their “ingenious” frameworks.

Pick tools carefully, get to know them well

So you’re afraid of incorporating third-party code into your project? Cool, it means you’re a responsible developer. But before you turn down popular and well-tested solutions, get to know them well. How much can you fine-tune them? How much can they be customized and stripped of unnecessary features?

Back to PHP stuff, Symfony is highly customizable especially since version 4. Symfony and Doctrine tend to cache a lot of things. You can also optimize Composer’s autoloader.

Don’t optimize prematurely

Performance can be easily improved. But business logic can be a real monster.

After a couple of years working on big, sophisticated projects I realized that performance is a secondary issue. The biggest challenge for me has been always dealing with complex business logic, huge number of moving parts dependent on each other, unclear rules etc. You need good tools to model and test that logic.

So what you shortened page loading time from 300 to 100 milliseconds if your teammates do not understand the crazy optimizations you just made? What benefit do you have after forcing your team to use weird tools they will complain about?

You need to balance these things out.

Defensive coding: Null-safe string comparisons

One of the most valuable tips I received why having my Java code reviewed was to change this code:

if (merchant.getSalesChannel().equals("Mexico")) {
    …
}

to this:

if ("Mexico".equals(merchant.getSalesChannel()) {
    …
}

Everything in Java is an object. Thus, a simple string literal immediately creates a new object. We are sure that a literal is not null.

The other part of the equation might surprise us with a null. Who knows what that getter might return? Most likely it is some value from the database. Are we sure that the value was properly retrieved? Are we sure there’s a NOT NULL constraint on the database column? We can never be sure.

So if we want to compare a variable to a string literal, let’s use the null-safe style proposed in the second listing above.

Further reading:

https://stackoverflow.com/questions/43409500/compare-strings-avoiding-nullpointerexception/43409501

https://softwareengineering.stackexchange.com/questions/313673/is-use-abc-equalsmystring-instead-of-mystring-equalsabc-to-avoid-null-p

https://www.javacodegeeks.com/2012/06/avoid-null-pointer-exception-in-java.html

Defensive coding: Make local objects final

In the very first article on defensive coding we talked about avoiding mutability.

Let’s talk about JavaScript for a while. Every modern tutorial says that you can instantiate new objects either with let or const. You use let if you intend to create a variable (like with an obsolete var keyword), or you can use const to indicate that an object – once initialized – should remain constant. This gives developers great clarity and saves them from mistaken mutations.

What about Java? You don’t have a const keyword, but you have final. I love using it and many developers in my current project also use final wherever feasible. You can mark properties, local objects and method arguments as final. There are very rare cases when we need to mutate an object, so must of the time I end up having final everywhere:

public BigDecimal getTotalPrice(final Product product,
                                final BigInteger quantity) {
    final BigDecimal totalNetPrice = product
             .getUnitPrice()
             .multiply(quantity);

    return totalNetPrice.multiply(product.getTaxRate());
}

Why use final everywhere? It can look strange in the beginning, but it protects us from bugs. How many times have you seen robust methods spanning multiple screens, with vague and similar variable names? How many times have you mistaken foo, oldFoo, databaseFoo, foo1 etc.?

In PHP, the final keyword can be applied only to classes. You can’t create local constants, only class constants with const keyword. It’s even worse that you don’t clearly distingish first object instantiation from subsequent instantations using the same variable name. This code is perfectly valid in PHP:

$builder = new ProductBuilder();
$builder = new UserBuilder();

Defensive coding: Final properties and proper autowiring in Spring

How to get rid of “Field injection is not recommended” warning (and what’s the problem)?

I work in a project where most dependencies in Spring are declared this way:

@Component
class DocumentsController {
    @Autowired
    private UsersRepository users;

    @Autowired
    private DocumentsRepository documents;

    @Autowired
    private PdfGenerator pdfGenerator;
}

For every @Autowired annotation above, my favorite IDE complains that “field injection is not recommended”. What does it mean?

It’s a nasty hack. I believe every class should be explicit about its dependencies, especially the required ones. They should go into a constructor. Anyone trying to instantiate above class manually should immediately see what is needed.

Dependencies should not be mutable. In the example above, you cannot use final for properties because your compiler will complain that the objects are not initialized. The compiler doesn’t know the Spring magic behind instantation of these objects. This means that someone can accidentally overwrite the property.

Let’s refactor that code:

@Component
final class DocumentsController {
    private final UsersRepository users;
    private final DocumentsRepository documents;
    private final PdfGenerator pdfGenerator;

    @Autowired
    public DocumentsController(final UsersRepository users,
                               final DocumentsRepository documents,
                               final PdfGenerator pdfGenerator) {
        this.users = users;
        this.documents = documents;
        this.pdfGenerator = pdfGenerator;
    }
}

If you don’t want to write boilerplate code, you can use Lombok’s @RequiredArgsConstructor to create the constructor automatically. Still, you benefit from using final and having explicit dependencies (but this requires having a Lombok plugin for your IDE).

Defensive coding: Avoiding mutability and side effects

Are you tired of fixing the same bugs on and on in a huge system developed by a multitude of developers? I guess it’s time to introduce some practices of Defensive Programming. It is an approach to improve software quality by reducing number of possible bugs and mistakes and making the code more predictable and comprehensible.

Before we continue, I advise that you get familiar with NASA’s Power of 10 Rules, designed for developing

You can also read about poka-yoke approach to design things in a way that prevents possible misuses. An example from a real world is a SIM card that can be inserted in only one position. Now think how many times, as a developer, you were misled by someone else’s code? How many times have you used wrong function, wrong variable, wrong argument, wrong format?

Writing reliable code is an art of avoiding mistakes. Let’s see how to do that.

Avoid mutability and side effects

How many variables are there in your code? How many of them are not used anymore? How many of them have misleading names?

Every variable can affect the behavior of your program (well, that’s what they’re designed for). Variables are meant to have different values over runtime. Are you sure you’re controlling all of them properly?

Here are some basic rules for variable handling:

  1. Do you really need a variable? Maybe a constant will be enough? In JavaScript, use const instead of let. In Java, use final to indicate the value can be assigned only once.
  2. Use the strictest scope possible. Avoid global variables.
  3. Avoid modifying object state and global state if not necessary. A function that modifies anything outside the scope of itself is creating a side effect. Side effects often cause unpredictable bugs and make testing more difficult. Even if you just poke the objects passed as function arguments, it is a side effect. At least be aware of the consequences.
  4. For simple values like date/time, use immutable value objects. For example, if you want to add 2 months to an existing date, create a new result object instead of modifying the existing one.
  5. Use precise and meaningful naming, not just i, j, k or temp. How can other developer know which variable is responsible for what? On the other hand, take context into account. Avoid lengthy names with unnecessary prefixes, like bigFancyModuleUser, bigFancyModuleProduct etc. If these variables exist only in the scope of BigFancyModule package, skip the prefix.

How I optimized a process from 35 to 5 hours

Most of my day job isn’t fascinating. Yet another controller, service, test, and so on. I spent a lof of time doing repetitive tasks and slowly gaining more knowledge about the system I’m working on. However, having slow and steady pace can eventually reward you with an opportunity to make a really great improvement. It happened to me, twice.

I spent four years maintaining a project with a 65 GB MySQL database and hundreds of millions of records. In the beginning, the system seemed to be very complex. It contained lots of legacy code and many classes turned out to be obsolete. I needed some time to raise my confidence with this project. After two years, I reduced the database size to 15 gigabytes without any data loss. Of course during my work, the database gained another millions of records, but that didn’t stop me from doing a stunning optimization.

It wasn’t a single database migration, but a series of small ones. It took me many months to come up with all the optimizations, sometimes subtle – but with 300 millions records, every byte counted. Database schema changes required also application code changes and I did not want to make too big PRs. Moreover, I couldn’t just execute a migration on a 60 GB table whenever I wanted. I had to agree with the Product Owner on a downtime. And, of course, I had to prepare backups and a rollback strategy.

Then I jumped to work on an advertising platform which had a complex invoicing system. Every night, a cron job was run to create and send PDF invoices. The process was supposed to finish in the morning, but it didn’t. People did not get their documents in time. I discovered that a single process can run up to 35 hours, even if just a few documents were made.

Again, I had to do several boring maintenance tasks before I had the courage to optimize the complex invoicing process. After gaining some basic system knowledge I noticed that the cron job did not have any tests. Every change required manual tests. So I spent some time writing unit and integration tests which helped me understand the process even more.

When I was ready to introduce changes, I talked to the Product Owner and he agreed to include that work in the upcoming sprint. I needed two weeks to do necessary measurements and experiments. In the end, I successfully deployed my changes and the process shrinked from 35 to 7 hours. I removed a lot of redundant database queries by simply verifying the boolean logic and the control flow. Are you aware of the way boolean expressions are evaluated? Knowing such nuts and bolts might really reward you.

How to conduct a successful process optimization

  1. Know the details of the business logic and code details of the project you are working on. To do so, maintain a steady workflow. Don’t be afraid of boring tasks – you have to start somewhere.
  2. Test the current behavior that you plan to optimize. Write automatic tests or at least prepare manual scenarios to cover as much cases as possible (not just happy paths). This will expand your project knowledge even further.
  3. Measure all steps of the current process. Which step is the slowest and why? How much time on average it takes to process a single entity? You need measures to know if you’re making any progress.
  4. Introduce changes in the code and verify them on your local or test environment.
  5. Gather feedback during code review. Maybe someone will notice dangerous changes you overlooked.
  6. Analyze what is needed for deployment. Will it require downtime? Any database migrations? Are there any other applications depending on your module/service/app?
  7. Practice deployment in a test or pre-production environment.
  8. Discuss the deployment time with the team and stakeholders.
  9. Good luck! Go with the deployment 🙂

Mixing office and remote workers in one organization

Remote work is challenging. People working remotely need perfect communication skills and discipline because no one is watching over their shoulder. However, the real struggle starts when we try to combine office and remote employees. What problems are we going to face and how can we improve the situation?

I was tired of working in an office, so switching to a remote job was a big relief to me. However, I still had to cooperate with people sitting in an office and it turned out to be quite a challenge for all of us. Luckily, with some understanding from everyone in the team, things improved quickly.

Let others know where and who you are

Your coworkers should know if you’re available or not. Set a status which says “In the office” or “Working remotely 9-5”. Use “Away” when you have a break and “Do not disturb” when you need some peace.

At some point, my organization forced everyone to set their photos as profile pictures on Slack. All the funny cats as avatars are now gone. It’s a good way to integrate people, especially when remote guys visit the office from time to time.

Improve conference calls between office and remote people

We often have calls where one group of people is sitting at the office and another group is connecting remotely. The biggest challenge is to create equal participation opportunities for everyone in the team.

Both office and remote people must have a good connection and a good microphone, so everyone can understand what other people are saying. Remote folks usually have headsets; please check their sound quality upfront! If they sound like an old telephone, buy something better.

The best table setup for a call with remote people. Every person maintains the same distance from a microphone. A webcam captures everyone at the table, so that remote participants see exactly what’s happening.

The office group can have a shared microphone on the table. You can find some good products with an omnidirectional mic and an integrated speaker for around $100. Quality matters even more because people are going to sit in some distance from the microphone, so remote guys will hear more room reflections. Sit around the microphone in an equal distance, so everyone can be heard equally loud.

When the office team joins a meeting, they share one user account. Remote people do not know who exactly is present in the room. The solution is simple: turn the camera on! The best solution is to have an external camera with an overall view of the conference room. If you don’t have it, just rotate a laptop whenever someone else is starting to speak.

Any new people should introduce themselves, like “Hi, I’m Mark, I’m resposible for X and I joined the meeting because …”

It’s good to know who’s sitting with us and why, and it’s nice to see people smiling, so remote people can launch their cameras too.

Share anything valuable hanging on the office walls

Sometimes people at the office find it convenient to draw something on the wall, or stick some cards here and there. Remote workers do not see these walls. You need to at least share a picture of any diagrams you made on that wall. Make sure remote folks are somehow able to contribute to those drawings.

The same goes with any printed announcements, like “Hey, we’re having a party tomorrow”. Of course if remote people can relate to them. You don’t have to share information about a broken coffee machine, for instance.

Meet in person from time to time

You should meet and have some fun together. The team’s mood is much better when you share memories from trips and parties. An organization can facilitate this by organizing different events, like trainings, conferences, lightning talks, etc. Of course you can also have your own initiatives, even just having pizza and beer.

Mixing office and remote workers can bring a lot of fun. It increases diversity because a company does not limit itself to hiring only the people preferring a specific location for work. However, it takes some practice to do it right and get rid of any communication obstacles.

Offboarding: How to quit the job gracefully

Recently I shared my thoughts on preparing an efficient onboarding process. Unfortunately, sooner or later people quit. This is also something our dev team should prepare for.

Most of the time, employees and contractors have a notice period which lasts usually from two weeks to three months. This might seem like enough time to transfer knowledge and duties, but my experience shows that it’s never enough.

Prepare for the offboarding period

When an experienced person brings their resignation letter, often a panic mode starts within the team. Suddenly everyone wants something and the person who quits has a really busy time.

Sometimes, ambition strikes. Knowing that the days of our current job are counted, we can fall into a mania of fixing everything. This can bring too much chaos into the team and the resulting value is not worth it. If you suddenly decide to update all libraries in the system, you can break a lot of things and drag people away from their current tasks.

There is a simple way to avoid such an “offboarding rush”. Your team should perform regular documentation updates, automate manual tasks, write tests and don’t wait too long with refactoring bad code. This should be a routine, not an exception.

You should avoid the so-called knowledge silos. It’s inevitable that developers in your team will specialize in different parts of the system, and that’s ok. However, you should faciliate information exchange when needed. This includes proper knowledge base, commit messages, branch names, issue descriptions, chat groups/channels etc.

If you create a hack or workaround somewhere, under time pressure, then at least describe it in a visible place. Other people will be aware of that hack’s existence when you’re not around. I’ve witnessed a lot of situations when experienced people left a lot of hacks even on production servers and then just quit. Such hacks break in the least appropriate time, like a peak of a sales season.

The same goes with every manual task done by a particular person, like manual deployment, setting up repositories and so on. These should be at least documented, so that someone can take over such duties. The best way is to automate as much as possible in a clean and descriptive manner.

Easy to say, huh? But developers often postpone maintenance tasks which are not a part of the client’s requirements. If stakeholders don’t yell at you when you don’t ship unit tests on time, then… you postpone them. They won’t complain about your internal documentation either. Stakeholders aren’t interested if your workplace is clean or dirty if they get a working product. But things can break soon if you don’t clean your mess!

Make a good use of the time that’s left

When the date of somebody’s leave is clear, the whole team should decide how to spend that time. You can organize knowledge sharing meetings or pair programming sessions. You can sit and discuss refactoring ideas that could be realized in the future. An experienced developer might share thoughts about further product development and possible problems.

Focus on the most important and valuable activites. As I said before, don’t try to fix everything at once. Also, a person that is about to quit should not do any more tasks on his or her own. The offboarding time should be spent on helping the team.

Be honest but polite

Quitting a job is often preceded by months or even years of frustration. There can be a lot of reasons for that, and when you finally make the decision to leave, you probably think it’s the only solution left.

In the first years of my career I made a mistake of not being honest about my work expectations, like salary or duties. Employment is all about two parties getting along with each other. You shouldn’t be afraid to talk honestly to your boss.

However, don’t burn bridges. You never know what happens in the future. Maybe you’ll meet your boss again in another company? Maybe some day you’ll receive an interesting offer from your boss?

Also, be polite and do not spread your frustration everywhere. Your team might still contain new, enthusiastic people. Don’t break their motivation.

Be responsible

The way people quit a job tells a lot about their social skills and emotional intelligence. Also, the way a team deals with people leaving the workplace speaks a lot about its practices. We all need to be responsible, mature and polite, so we can avoid sudden stress caused by someone quitting.