6 Steps to Effective Developer’s Onboarding

Who’s responsible for building a development team? A team leader? Scrum master? Senior engineers? I strongly believe that every team member contributes to its work culture. I also believe that one of the best benchmarks of a team’s performance is the way new developers are introduced.

Hiring a new developer is a big investment. Once a candidate passes all the recruitment steps and walks into our room, we want him or her to start proper work as soon as possible, so the investment would start returning. Most of the time we already know why we want that person even before he or she is hired. Perhaps we have certain expectations. Maybe an experienced team member just left, or we need more resources to deliver new features and bugfixes?

On the other hand, sometimes we are the ones who change jobs and need to dive into an unknown environment. We want to show off and bring value as early as possible to prove we are worth the time and money spent on hiring us.

Let’s see what we can do to improve the onboarding process in both scenarios.

Know the new employee (or employer)

A good onboarding process should use all the data gathered during recruitment. Our candidate already answered a lot of questions about his or her experience, favorite tools, career goals, and so on. Maybe he or she already solved a technical task or wrote a test prepared by our company? All of these data should be an input to the invididual onboarding process.

If you’re joining a company, you should also gather as much informations as possible. Help improving the onboarding by asking important questions during recruitment: what tools do you use, what is your work culture, how an average day looks like, and so on

If the company’s HR department or agency did not provide you enough details, ask for them. You should know as much as possible about your future coworker or workplace.

Choose a mentor

It’s good to have a friend in a new workplace. Most people cannot memorize the names and functions of everyone around in the first day. Onboarding is easier when a new employee have one person to ask most questions, like “where is this or that in the office” or “who should I ping about my Git access”.

But a mentor is someone more than that. A mentor helps setting career goals and evaluating them. A mentor tries to compromise company goals with the individual goals of an employee. A mentor shares experience and helps developing a position in the organization. Of course mentoring takes some time and attention away from pure coding. But programming is a team sport, right?

A mentor does not have to be that one senior developer who’s been here for ages. Even regular developers who spent only a couple of months in the team can guide new folks. A mentor should not pretend to know everything. It’s all about willing to work together and help out.

It’s good to spread the mentoring duties across different developers, so we don’t end up with one senior trying to guide the whole team. A bottleneck is made when all the questions and decisions go through a single person. Your team should learn self-organization.

If you’re about to join a team, ask who is going to help you in your first days. Maybe you could meet that person earlier and have a chat?

Prepare an onboarding plan

Don’t improvise. Every developer needs some basic stuff to work: a computer, a desk, a comfortable chair, internet and/or intranet connection, software licenses, e-mail account, repository and knowledge base accesses, and so on. Arrange them in advance.

Plan at least the first day which probably will consist of introducing a new member into the team, meeting some people, talking about the project and team habits and maybe visiting HR department for some additional paperwork.

Think about the first tasks a new member could be assigned to. A simple label change or a trivial bugfix would be the best. In the beginning, everyone has to know the workflow and team practices. A fresh developer will surely ask these questions:

  • How do I clone repositories? Do we use SSH keys?
  • How do I create my branch? How do I name it?
  • Should I include an issue number in the commit message?
  • Should I rebase my branch before pushing?
  • How do I launch tests and check their results?
  • Who should I ask for a code review?

Onboarding will take some time for sure, so remember to reserve it in your team’s calendar.

Review your documentation and communication practices

Everything that makes the work of experienced developers easier will also help the newcomers. It’s good to have some basic piece of documentation we can refer to. Check if it’s up-to-date and if it contains at least basic information which can help fresh people.

I’m not a fan of robust, official company documents. Often a simple glossary of terms and some architecture diagrams should be enough for newcomers to understand everyday conversations.

Consider improving your team’s communication. How do people communicate? Are there knowledge silos? Do you use public channels on Slack or maybe all decisions are secretly made in direct messages? How do you use an issue tracker? Do issues have precise descriptions?

Introduce a new member to the team, the project and the company

Most people cannot memorize more than a few names in the first day, but you can make it a bit easier. I once joined a team where everyone had different nicknames and avatars across Slack and GitHub. It’s good to have them unified. Later, my employer decided that everyone should put his or her photo as an avatar, instead of funny cats and other weird stuff. This made recognizing people in the hallway a lot easier.

A great way to increase people’s motivation is to let them know the details of the business they work for. Tell them a bit about the company and product history. Provide a business context and describe the reason this team exists. Maybe a new person could talk to someone else than just fellow developers? Do a product demo.

If you just joined a new team, be curious! Don’t hesitate to ask questions.

Establish goals and evaluate them

Our work should serve a purpose and every team member should feel it. Set individual SMART goals for upcoming 3-6 months, then provide support and inspect progress.

I believe that everyone can contribute to the organization from the first day. Discuss what specific value a person can bring in the beginning. I can think of:

  • Update documentation, especially the part about setting the development environment.
  • Write tests. In one project I started from writing numerous unit and integration tests for a module refactored by my experienced colleague. We cooperated closely and soon I started finding bugs in his code.
  • Review pull requests. Why not? Good questions from a newbie might help finding weak sides of PRs made by experienced developers. It can be also a nice variation of rubber duck debugging.
  • Pair programming sessions whenever it’s feasible to split work for two people.

These are some basic goals, but you should also agree on something more related to the specific product(s) your team is developing.

Wrapping up

Onboarding is an exciting process and has a massive effect on the team’s and each individual’s performance. It should be well-planned and conducted in a friendly, calm atmosphere. This is the time when people make new connections which will be crucial for their later work. It is also a great moment to review and improve team’s practices.

Picking a PHP tool to read and manipulate PDF files

In the previous article I described several tools that can be used together with PHP to create PDF files. Back then, the choice was not easy and we had a lot of criteria to consider while picking the best tool. Today we will browse possibilities to read and edit existing PDF files.

Native PHP libraries

Again, we will start from checking if there are any PHP libraries to manipulate PDF files without depending on external binary tools.

pdfparser

There is an interesting library called smalot/pdfparser. It has almost 1000 stars on GitHub. It utilizes TCPDF Parser to parse a PDF file into an array of document objects which is further processed to get what we need.

The library is convenient as it supports both parsing an existing file or a string with PDF data. It allows you to extract metadata and plain text from a document. You can test the library at its demo page.

The problem is that Sebastien’s library is based on old TCPDF version 6 parser which some day is going to be replaced by a newer rewrite called tc-lib-pdf-parser. However, that new parser is still under development and Sebastien’s is aware of its existence.

smalot/pdfparser has commercial support from Actualys.

FPDI

I got familiar with this library when I received a bug report for a watermarking module in some e-book system. The module received a PDF, parsed it using FPDI, generated a watermark with FPDF and stamped it over all pages.

The problem is that the free version of FPDI supports only PDF version 1.4 and below. To support higher document versions, you have to buy a full library. And that’s what the bug report was about. We decided to switch to another tool, pdftk, which is described below.

Command-line tools

The first command-line tool I played with was pdftk. I used it to join separate documents into one, apply watermarks and extract basic data, like a number of pages. It supports all PDF formats unlike FPDI library. The only thing that’s missing is a text extraction feature.

The need to extract plain text from a document led me to the Apache PDFBox library. It is written in Java and, as I described before, it offers some very nice features. However, in the PHP world we can only access a CLI wrapper for that library which has a limited set of options.

Later I discovered the Poppler library, which is said to fully support the ISO 32000-1 standard for PDF. This C++ library can be accessed via dedicated CLI tools – poppler-utils, which we can run from PHP. For example, the pdftotext tool gives a lot of control over the plain text dump – you can even preserve a proper document layout while rendering, or crop the document to a specified region. Also, pdfinfo provides comprehensive information about a file, like page format, encryption type etc. You can use it to extract JavaScript too.

Sometimes you might want to create a PNG or JPEG screenshot of a document. You can do it with pdftocairo from Poppler, or use ImageMagick’s convert.

Wrappers

For pdftk, check out this library: mikehaertl/php-pdftk.

PDFBox CLI can be accessed via schmengler/PdfBox.

Imagemagick and Ghostscript are the basis for spatie/pdf-to-image wrapper.

Poppler has several PHP wrapper libraries:

  • spatie/pdf-to-text only allows to extract text from a PDF. It requires an input PDF to exist in the file system. The library does not wrap additional input arguments, so you have to specify them manually.
  • ncjoes/poppler-php: a library supposed to wrap all poppler-utils, but at the moment pdftotext is still unsupported. Also, this library is not very convenient as it forces you to choose an output directory for a file (it does not return processed data as string).

In fact, these two libraries are wrappers to a wrapper, since poppler-utils are just a collection of CLI wrappers for the Poppler C++ library 😉

Which to pick? Native or CLI?

There are a couple of basic considerations.

Native PHP libraries should work independently from the host environment. They are a lot easier to set up and update. The only depedency tool you use is Composer.

CLI tools, especially these written in C/C++, might be faster and use less memory. However I don’t have strict evidence at the moment. Maybe all the optimizations that came with PHP 7 will make this point obsolete. Also, I believe that C/C++ tools have a wider audience and thus might receive more community support.

You should pick a tool that’s best for your specific requirements. Most tools will do a decent job while simply rendering an unencrypted PDF to an image or some plain text. But if you need to have more control on the output file structure or you want to process encrypted documents, poppler-utils will be a good choice.

Sometimes it occurs to me that many developers are just reinventing the wheel, especially when it comes to a multitude of PDF processing libraries for PHP. The Portable Document Format has almost seven hundred pages of specification. We are all struggling with the same processing issues. That’s why I rather prefer to choose the best tools in different technologies and connect them with interfaces rather than doggedly sticking to a single technology.

Check out the List of PDF software at Wikipedia.

Picking a PHP tool to generate PDFs

I spent a lot of time working with different tools to generate PDF files, mainly invoices and reports. Some of these documents were really sophisticated, including multi-page tables, colorful charts, headers and footers.

I know how hard it is to choose between a multitude of libraries and tools, especially when we need to do a non-trivial job. There is no silver bullet; some tools are better for certain jobs and not so good for other jobs. I will try to sum up what I’ve learned through the years.

Two ways of creating a PDF file

A PDF file contains a set of objects which make a document, like pieces of text, images, lines, forms, fonts, and so on. So creating a PDF is an act of putting all these pieces together in a proper order and layout. Most objects utilize a subset of PostScript commands, so you can even write them in your text editor.

One way is to create these objects “by hand”: we add every line of text separately. We draw all tables manually, calculating cell widths and spacings on our own. We must know when to split longer contents into multiple pages. This approach requires a lot of manual work and very good programming skills, so we don’t end up with a spaghetti code where it is hard to find any meaningful logic between all the drawing commands.

Another way is to convert one document, for example HTML, LaTeX or PostScript into PDF.

We used LaTeX for an education app which allowed composing tests for students from existing exercises prepared by professionals. Since LaTeX was a primary tool for our editors, it was natural for us to convert their scripts straight to PDF.

Converting HTML to PDF is way more complex as today’s web standards are having more and more features, just to mention CSS Flexbox or Grid layouts. Let’s see what we can do.

Native PHP libraries

My first experience was with native PHP libraries where you had to do most things by hand, like placing text in proper positions line by line, drawing rectangles, calculating table cells, and so on. It was quite fun at the time, but creating more robust documents turned out to be very hard. We used FPDF and ZendPdf libraries (the latter is discontinued).

At some point, I ended up maintaining multiple-page, sophisticated school reports with tables and charts rendered by ZendPdf. Business wanted to add even more types of reports. I decided to rewrite all reports as HTML documents with stylesheets and then try to make PDFs from that.

There are three PHP libraries capable of parsing HTML/CSS and transforming that to PDF:

Rendering HTML and CSS certainly isn’t easy. Modern browser engines are huge projects and I can’t imagine a fully functional rendering engine written in pure PHP. So you cannot expect these libraries to provide the same output you’re seeing in Firefox or Chrome. However, for simple layouts and formatting they should be enough. Plus is that you still do not depend on any external tools – just plain PHP!

To give you some idea of what to expect from above libraries, I compiled a comparison of an invoice renderings. These three pictures are made from the same HTML 5 source which utilizes CSS Flexbox to position “Seller” and “Buyer” sections next to each other. It has also some table formatting:

Google Chrome (reference image)
TCPDF

mPDF
Dompdf

As you can see, none of the PHP libraries understood CSS Flexbox. mPDF and TCPDF had some problems with painting the table. Dompdf performed the best and I’m pretty sure that making the “Seller” and “Buyer” sections the old-school way, like float or <table> would be enough to have a proper result.

External CLI tools

Native PHP solutions were not enough for me, so I decided to use an external tool backed by a fully functional, WebKit rendering engine. My employer was already using wkhtmltopdf which supports everything I needed: SVG images, multi-page tables, headers and footers with page numbers and section names, automatic bookmarks. Having old reports rewritten to HTML and CSS, I was able to implement all the new features requested by the business.

wkhtmltopdf certainly isn’t bug-free; for example, I had some issues with repeating table headers on consecutive pages. Also, upgrading from 0.12.3 to 0.12.4 broke my document layout which used dynamic headers and footers, so I had to go back to the old version.

Then I got familiar with PhantomJS, which was used mainly to conduct automatic browser tests in a headless mode (without the browser window). It could also capture PNG and PDF screenshots. PhantomJS used a newer version of the WebKit engine. However, the project is suspended now.

Almost a year before the suspension of PhantomJS, Google announced that Chrome can run in a headless mode from version 59. This means you can utilize the latest Blink rendering engine to convert HTML/CSS to PDF from your command line. This is perfect for rendering really complex documents utilizing latest web standards. The document looks exactly the same in your browser and in the final PDF file which makes development a lot easier.

Connecting PHP with external tools

The easiest way would be to execute an external tool as a shell command. You can do it with PHP functions like shell_exec or proc_open, but it’s not very convenient.

I recommend using symfony/process library and utilize standard streams whenever applicable. A process should accept input HTML through STDIN and send the resulting PDF via STDOUT. It can also produce some errors through STDERR. It might turn out that you won’t need any temporary files to do the job.

There are also several wrapper libraries, like phpwkhtmltopdf or KnpLabs/snappy.

For Chrome, consider using Browserless. You can choose between a free Docker image with pre-configured Chrome with dependencies, or a paid SaaS platform to convert your HTMLs to PDF. With the Docker image, it is really easy to send HTML and receive PDF via HTTP.

Conclusion

There is a wide choice of PHP libraries and external tools which can be used to dynamically create PDF files. You should choose a combination which suits your business needs. For simple documents, you don’t need a complex rendering engine. Save disk space, CPU and RAM!

Please also remember that many tools are developed by the Open Source community and receive little commercial support. They can be abandoned at any time or they might not support newest PHP version from day one (which can impede migrating the rest of your app). And your dependencies have dependencies too, so take a look at composer.json when picking a library.

And if your favorite Open Source tool does not do everything you need properly – maybe try contributing? It’s a community, after all.

Testing PDF documents

I’ve been wondering for some time if PDF is still a valid format. It’s “portable”, of course, but not in today’s meaning – it’s clearly not responsive. Like a fixed piece of paper transformed into a file. However, PDF still has many important use cases like storing invoices, reports or tickets. I spent a couple of years working on sophisticated PDF reports, and this year I even tried to test a process of generating invoices in some ad exchange system. I really wanted this system to be rock solid.

Of course there is no point in comparing a binary PDF file to an expected value. You can’t catch the exact differences in case of an error. I could create a PNG screenshot and compare it to the template, but I was a bit worried about the readability of such diff. A third way would be to verify the source HTML document used to render a PDF – but in fact, I was not interested in markup, but in an output data that landed inside a PDF.

Following my friend’s advice, I used another tool called Apache PDFBox. This robust library allows performing different operations on PDF documents: creating, merging, splitting, signing, filling forms etc. We decided to extract plain text from a file. It’s like we used a Select all command, copy and paste the text into Notepad.

PDDocument pdf = PDDocument.load(content);
PDFTextStripper stripper = new PDFTextStripper();
stripper.setAddMoreFormatting(true);
stripper.setSortByPosition(true);
stripper.setStartPage(0);
stripper.setEndPage(pdf.getNumberOfPages());
String plainText = stripper.getText(pdf);
pdf.close();

assertThat(plainText).isEqualTo(expected);

A PDF document consists of blocks which can be ordered in a way that we did not really expect. Anyone who tried copying a table from PDF to Notepad experienced this. Luckily, PDFBox tries to help us organizing the blocks and formatting the plain text dump.

We made a lot of test scenarios using the above solution and they did a really good job catching all the little bugs in the data. It was crucial to detect any mistakes because our system was preparing financial documents. Moreover, the test reports were very readable.

The only problem with the above method is that it does not test the layout correctness. To achieve that, we could extract only specified regions from the document. In such case we assume that a rectangle with x1,y1,x2,y2 coordinates contain, for example, customer’s data:

Rectangle2D region = new Rectangle2D.Double(x1, y1, x2 - x1, y2 - y1);
PDDocument pdf = PDDocument.load(content);
PDFTextStripperByArea stripper = new PDFTextStripperByArea();
stripper.addRegion("Contact region", region);
stripper.extractRegions(pdf.getPage(0));
String plainText = stripper.getTextForRegion("Contact region");
pdf.close();

assertThat(plainText).isEqualTo(expected);

Java: Integer or int?

When I first saw primitive types like int or boolean mixed up with classes, I was very tempted to convert all primitives into IntegerBoolean and so on to maintain a clear coding style. But reading articles on the Internet and IntelliJ hints stopped me from doing such a stupid thing.

I read this wonderful rule of thumb:

Don’t create unnecessary objects.

Primivites always occupy less memory than objects, which was clearly described in this article on primitives. Maybe we don’t usually care about the memory usage, but if we process big amounts of data, every byte can count. Moreover, accessing primitives is faster because they are stored on the stack, not on the heap.

It’s also important to remember that a variable pointing to an Integer or Boolean object can be null. A primitive can’t. This difference matters for example when you retrieve data from an SQL database which allows NULLs. If you try to assign null to a primitive, you’ll get a Null Pointer Exception. That’s a bug I noticed and fixed in a production system. So I always try to restrict the possibility to use NULLs in the database and in the code unless null has a real business meaning.

NPE: Converting a list of objects to a map

Like every Java developer, I had a fair amount of Null Pointer Exceptions in my career. That’s why I decided to describe real-world examples I had to fix.

Today I will show you how to get an NPE while retrieving records from a database and creating a map based on these records, using streams and collectors introduced in Java 8.

Let’s say we have a Setting class:

public class Setting {
private final int accountId;
private final String type;
private final String value;

public Setting(int accountId, String type, String value) {
this.accountId = accountId;
this.type = type;
this.value = value;
}

public String type() {
return type;
}

public String value() {
return value;
}
}

Now, in another part of the system we will retrieve Setting objects from a database. We haven’t noticed that the SQL table allows NULL in the value column. Let’s pretend our list of settings looks like that:

List<Setting> settings = new ArrayList<>();
settings.add(new Setting(1, "site_title", "My blog"));
settings.add(new Setting(1, "site_description", "A place with fresh ideas"));
settings.add(new Setting(1, "site_copyright", null));

We want to easily access settings by names, so we create a map using a stream:Ponieważ chcemy mieć łatwy dostęp do ustawień o konkretnej nazwie, tworzymy mapę za pomocą strumienia:

Map<String, String> settingsMap = settings
.stream()
.collect(Collectors.toMap(Setting::type, Setting::value));

Bang! We’ve got a NPE here because one of the values used to create a map is null. It turns out that Collectors.toMap() does not handle NULLs.

Let’s think: do we really need a NULL in this case? I guess not. A NULL value means “no data” or “unknown value”, which is the same as if the setting did not exist. We should set the value column as NOT NULL. We can additionally filter the settings list:

Map<String, String> settingsMap = settings
.stream()
.filter(setting -> null != setting.value())
.collect(Collectors.toMap(Setting::type, Setting::value));

A NULL would make sense for example if we wanted to create a map between people’s names and their ages. If we don’t know someone’s age, then a null seems reasonable. We would have then to create a map in a different way.

You can have a further read on StackOverflow. Alternative ways to convert a list to a map can be found on Baeldung.

Do we still need recruitment agencies?

Article originally published on dev.to

As a candidate looking for jobs, I’ve never cooperated with any recruitment agencies. But as a senior developer responsible for tech interviews, I was forced to work with some HR companies and I had some really weird situations because of that. Sometimes it’s annoying, sometimes it just makes me laugh. Anyway, due to my bad experiences it’s hard for me to find any reason to pay commission to an HR/talent agency. I’ve always disliked “men-in-the-middle”. Yet, many companies seek talents through agencies in addition to their own headhunting struggles.

Let me briefly explain a standard recruitment process we developed in our company. We didn’t have an HR department, so the whole process was led by me and CTO. First, I prepared a job offer which would reflect our current needs. Then, the CTO posted that offer on various platforms. We received different resumes and reviewed them. If you did not have a meaningful GitHub profile, we usually asked to do our simple recruitment task: write a PHP shopping cart implementation for existing unit tests; kind of TDD, most people solved it under 4 hours. We also sent some technical questions to see if we don’t waste your (and our) time meeting you at the office. Especially if you lived several hundred miles away. If we liked your answers and the solution to our task, we would invite you to a personal meeting (around 1.5 hour) consisting of a soft interview and tech interview. Finally, you could receive an offer from us… and turn it down if you did not like it.

Simple, isn’t it? Just two parties making business with each other. We are the client and we buy services that you provide. We negotiate the deal and if everyone’s happy, just tell us when you can start. The description I mentioned is generic; we tried to approach every candidate individually according to the provided materials, proven experience and a lot of other factors. You didn’t have to solve our coding task if you found another way to show off. Fun fact: we’ve never cared about your formal education.

Unfortunately the resumes we received from job board users were not enough and because the company did not want to invest its money in greater headhunting endeavors like going to conferences or recruiting a full-time HR guy, well… it decided to use some help from external HR agencies. We would present our needs to the agency and it was supposed to find right people for the job. We received a written recommendation for every candidate. After successfully hiring an employee, the company paid commission to the agency.

How recruiting with agencies went wrong

I wouldn’t moan if those companies did their job well, but what I experienced instead was:

  1. The recommendations were generic and useless. Every candidate was described as a positive, pro-active team player who aims for personal development. Blah, blah, blah. If you stripped that BS, you would find out that you’re dealing with a mediocre coder with a boring portfolio. Both the company and the candidate wasted time talking to the agency because one way or another, we had to figure out most things for ourselves. The most interesting facts were those off the record.
  2. One agency used to save these recommendations as DOCX files. I work on Ubuntu and LibreOffice did not properly render those files, throwing some contents away from the page. I had to switch to Windows, launch Microsoft Word, prepare myself a PDF file and switch back. Eventually, the HR guys learned how to create PDF themselves. What a relief.
  3. The same agency stripped my recruitment task. It originally was a GitHub repo which consisted of some important files in the root directory (composer.json, docker-compose.yml, phpunit.xml.dist and – most important – README.md!) and a tests directory. You had to write a simple implementation which passed all of my tests. I surprisingly found out that all candidates sent by that agency rewrote composer.json on their own. I asked them: “Why would you do that? The whole autoloader was defined there!.” It turned out that the agency sent the candidates only the tests directory. They did not send the full repo, of course – the candidate could find out what company made that task. The target company’s name is top secret in the beginning – and I recklessly used it as a vendor part of PSR-4 namespaces.
  4. The situation with missing composer.json involved also one candidate which was so tired of endless meetings with an agency that – after learning what company is he applying to – he declined to cooperate with the agency and sent his resume directly to us. I didn’t know that story and I was surprised to see that he already solved our task.
  5. Speaking of endless meetings – that’s the thing I always hear if someone decides to share his or her experience with an HR agency. When you see a fancy job ad like “For our client, a market leader…” and you send your resume, you’re invited to an entry interview in the agency HQ. Then you receive our task to do at home. Then they invite you to another meeting… but not with us! If that meeting succeeds, you eventually make your way to our facility. You go there and probably hear the same questions you’ve already answered in the previous meetings, because the incompetent agency did not properly sum up your answers and we don’t have enough data about you.
  6. Sometimes we don’t receive a full resume, but only a solved task (kind of a blind recruitment). Sometimes we receive a resume with a surname washed out. Sometimes we receive only the recommendation and not the original resume.
  7. Sometimes a candidate is forced to visit the target company together with a guy from HR, who ensures that we don’t make a deal behind their back and we don’t abuse you. To me it’s like coming to an interview with your dad! He’s a man-in-the-middle. It’s not a deal between an employee and an employer. Fortunately, your “dad” stays only for a “soft interview” and he walks away during the tech interview (which I lead). He goes for a coffee and waits there until we’re done. So basically he gets paid for drinking coffee in our office.
  8. It’s funny to see how an extravert, upbeat HR guy brings a terrified candidate to the meeting. At the first glance I would hire the HR guy, not a shy, scared dev.
  9. My coworker received an offer for my position from an agency when I decided to leave the company.
  10. My boss received an offer from an agency to hire a colleague who was just about to leave the company.
  11. When you get hired and work for a while, the agency calls you from time to time to ask how is it going. That’s what my coworkers told me. As an employee, I would find this annoying.

I have no way to convince my soon-to-be-ex employer to stop wasting money with HR agencies (or invest in the right one). But I still wonder why good devs in Poland respond to the job ads posted by agencies where you don’t have a target company’s name disclosed. The salary range is not specified either. If I invest my time talking to the HR guy, I would like to know in advance who am I going to work for and what salary I can expect!

Carving my own path

Like I said in the beginning, I have never looked for a job through an agency. I have some basic googling skills, but let me show you a brief story of my career which provides some more tips:

  1. I met my first employer during high school. It was a local news company. My friend had already worked for them and he asked if I could make some photos at local events. Still being a student, I started my humble photography career in my spare time. After my high school exams (Polish matura) and before going to university, I got a permanent job offer. They discovered I can code PHP for food, so they hired me to work on their new website during the day AND make photos in the evenings. Weird setup, but this job gave me a lot of life experience.
  2. After five years, I met my future girlfriend. I decided to relocate to Gdańsk which is ~300 km away from my hometown. I started browsing pracuj.pl, a Polish job board and soon I found an offer from an education company. I sent my resume, did a recruitment task and after three weeks I got an offer! I’ve been working there for over four years. During that time, my salary has doubled and I went from a mid developer to a team leader.
  3. Me and my girlfriend decided to change our lifestyles and travel more. I needed a full remote job. I remember meeting a nice software house at a PHPers conference in Poznań. I sent my resume. Unfortunately they just stopped their recruitment process at a time. But after a month I received an e-mail inviting me to a new process, which I passed successfully. I had three interviews via phone and Skype: entry, tech and soft skills. I really liked all those interviews. One of my interviewers said he saw my PHPers presentation about database optimization and he already knew my name. I got the offer with a salary which allows me to live decently, buy more music equipment (oh yeah!) and even save money for retirement.

As you can see, my ~10 years career as a developer did not include any deals with any recruitment agencies. It is important to say that I’m not an easy-going, upbeat and extravert person. I’m not good at getting my foot in the door, but at least I’ve learned how to create my resume properly, find a possible employer and make him interested in my offer. I’ve spent a lot of time browsing Polish nofluffjobs board where all the offers are plain and simple, with salary specified upfront.

What’s more to do? I guess I should visit more conferences and give more talks, so that people will know my name. I should write more blog posts and possibly contribute to Open Source. That way I’m going to develop my personal brand and hopefully do my business stuff without the HR agencies.

If you work in an HR/talent agency, please improve your skills. I know that IT headhunting is very hard and it might be really frustrating to abuse LinkedIn for the whole day just to receive a bunch of rude replies (or no replies at all). But if you want to do a good job as a headhunter, you need to understand how candidates and companies behave and what they really need. We’re all here to make business happen, right?

Is a managing position good for me?

When you’re an engineer, sometimes you might get an offer to take over management duties. It might be for example leading smaller projects, being a scrum master or leading a whole team of developers. When to consider such options and how to get prepared for new challenges?

I assisted my team leader for two years in his management duties. Then I became a leader myself and rised employment to 8 developers. I learned a lot about working with people. After a year I decided to change my job and have some rest. Having some perspective now, I decided to share my experience.

Do I have to be a manager to get a raise?

It depends on the company policy. Management is not for everyone – you have to feel it and like it. Some companies know it and value both talented engineers and managers the same. Other companies give more value to employees who are eager to step out of line, take the wheel, lead projects, watch the budget and the deadlines. It’s good to make this clear during recruitment talk.

What do I get?

It’s nice to have a fancy job title in our e-mail footer, but it should not be a main reason to take a management role. I can see some other pros:

  • real influence on the way the team works and the projects are managed
  • soft skills development: working with people, knowing their personalities, d1eveloping non-verbal communication, resolving conflicts, keeping good atmosphere
  • satisfaction from mentoring and supporting others; you have a big influence on their careers
  • you will get credit for your team’s successes from the stakeholders

Do I fit into this role?

Think about what kind of boss would you like to have. And then become such a boss.

Programming is a team work and it requires trusting people, being a team player and taking responsibility. This is why giving commands does not work well here. We assume that a development team consists of intelligent, mature and open people – and we have to treat them like this.

As an employee, I’d like to have a boss who:

  • has a sense of humor 🙂
  • is open to discussion, does not force his will – but can have a deciding vote when needed,
  • respects other opinions, fosters having a dialog,
  • eases conflicts instead of making them worse,
  • has a positive approach, but also keeps both feet on the ground,
  • evaluates my work in an honest way,
  • clearly communicates his intentions and concerns,
  • shares experience and motivates others to move forward,
  • delegates tasks, does not leave the best for himself (herself),
  • doesn’t force anybody to work overtime,
  • is not a control freak.

Start with small steps! You can always take the initiative in your team even if you don’t have an official manager’s title. Help people in their everyday struggles and propose solutions. Actively participate in retrospectives and other meetings. If you see that your coworkers really appreciate your support, invite to discussions, share ideas and respect your opinion – it’s a sign you’re a good material for a leader.

How can I keep my development work while being a manager?

If you take a managing job, you’re going to lose some time spent previously on developing things and spend it on managing different relations with a lot of people. Your workday still has 8 hours (please resist the urge to work overtime!) and you have to allocate your effort wisely.

It might get hard for you to find time to peacefully finish your code, write good tests and refactor. You’ll find yourself delegating your favorite coding tasks to others (and that’s a valuable management skill!). You’ll notice that your teammates learn new technologies faster. After a couple of years you might feel staying behind the competition because management duties take most of your time. If you decide to change your job and switch back to coding full-time, you might have trouble following the latest trends.

Me and my fellow managers always had a favorite part of the day, like early morning or late afternoon, when we could do some coding in silence, completely undisturbed. However, turning the lights on (or off) every day in the office is not a good solution. It’s crucial to give the team a clear message when you’re available for them and when you should have some peace. You can always say “I’m sorry, I’m busy right now, give me a minute and I’ll get back to you”. Also, try to educate people, so they can find a solution for themselves – don’t do their work.

Having more duties requires refining your self-organization. Take care of your desk, mailbox, Slack conversations – keep these places clean. Eliminate distracting factors, configure notifications from different apps to avoid notification fatigue. Know your daily rhythm – your best and worst working hours.

Warning signs

Sometimes even though you’re doing your best, things aren’t working the way you expect. Maybe there is something you cannot influence? Sometimes it’s better to turn down a bad offer than waste your time and stress. Be careful if:

  • many people quit their jobs and you received an offer only because no one else wants to take it,
  • the company is trying to put too much tasks and responsibility on your plate,
  • the atmosphere in your team and in the company is bad and no one can see any solution,
  • the company always cuts the corners when you see an urgent need of investments, hiring more people or giving them raises,
  • you feel like you won’t get any support from other, more experienced managers and you’ll have to figure everything out yourself.

Some companies are just bad. Don’t let them overload you with work that’s not yours! If you see your employer cutting corners all the time, you’ll soon find yourself in charge of managing projects, meeting deadlines, talking to clients, recruiting people, mentoring, reviewing code, coding itself… You can’t do all these things simultaneously.

Keep your head up!

Managing an 8-person team was not easy for me. It required a lot of energy, creativity, perseverance and optimism. I made a lot of mistakes. I was tired and I decided to go back to coding full-time and make up my technical abilities. However, leading a team was a very valuable experience from me and maybe some day I will consider a similar offer again – if I get one.

If you’re not a complete outcast and a loner, maybe it’s worth trying to have some management duties. Even without being a formal superior, you can influence your team by being proactive. You should learn how to deal with people, manage your time and take responsibility for the business that hires you. Don’t avoid new challenges!

Take a look at this wonderful talk about leadership

When your SQL database is missing foreign keys

…then sooner or later, you’re going to have a bad time. Bugs in your app or users’ recklessness will cause your database to be inconsistent.

An example from my job: a system had users and users_categories tables. While registering their accounts, new users entered not only an e-mail address and a password, but also selected categories, like teacher, student, parent. The data were immediately inserted into a MySQL database. But the account had to be activated via e-mail. A script was executed every day to purge inactive accounts. It wiped records only from the users table, not users_categories. There were no foreign keys to block that behavior.

Every company has some erroneous legacy code here and there. I saw databases reaching 60-70 gigabytes in size, with hundreds of millions of records and not having foreign key constraints because someone… was afraid of them. A long time ago, one database in the company had ON DELETE CASCADE foreign keys, which means deleting one record caused cascaded deletions of related records. My colleague destroyed almost half of a test database, so he decided to remove foreign key constraints. That was a classic misunderstanding of the technology that we were using.

How a foreign key constraint works?

In a relational database, foreign keys are meant to secure the references between entities. For example, in a users_categories we have two relations: with users and categories tables. No row in the users_categories table cannot reference a non-existent user or category.

The only exception is if a row contains NULL values. This is a way to create optional relations.

Creating a foreign key usually looks like that (based on InnoDB engine in MySQL):

ALTER TABLE users_categories
ADD CONSTRAINT fk_users_categories_user_id
FOREIGN KEY (user_id) REFERENCES users (id)
ON DELETE RESTRICT

We create a constraint with a specified name (we don’t have to, but it’s good to have a name that we can later use during reverting a migration). We specify a column which should be restricted and a destination table and column which we want to refer to. In the end, we decide what happens if we try to delete related data. A default behavior is to restrict such query and issue an error, for example if we try to remove a user without removing his (her) categories list. We can also decide to have a cascade deletion or set NULLs, but I always choose the safest option which is to RESTRICT.

Foreign keys require that:

  • both columns (the one we create a constraint for and the one we refer to) must have exactly the same type; a SMALLINT column cannot refer to INT,
  • there must be an index in the source table for the column; it can be a multi-column index starting with that column; if no index is matching, a new one will be created
  • a constraint must have a unique name in the schema scope

Introducing foreign keys in an inconsistent database

If you have some existing data without foreign keys, you need to clean it up first. That’s a challenge because sooner or later, a database without foreign keys will be a mess. MySQL will not allow us to create a foreign key on wrong data – unless we issue a SET FOREIGN_KEY_CHECKS=0 query before adding constraints (that’s what mysqldump does by default). However, we would like to have proper data not only until now, but also to fix what we already have. We need to somehow untangle that existing spaghetti.

I decided to analyze existing data to know:

  • how many records with wrong references do I have
  • what strategy will be the best to fix these particular records
  • are there differences in column types

On every table I executed a query showing all records with wrong references:

SELECT * FROM users_categories WHERE NOT EXISTS (
SELECT * FROM users WHERE user_id = users.id
)

In the example above, if the user_id column would allow NULL values, we should add a following condition: user_id IS NOT NULL.

I noticed a funny trend between developers not using foreign keys: they define optional references as, for example, INT NOT NULL DEFAULT 0. This is not correct because usually, in referenced tables with AUTO_INCREMENT or SERIAL primary keys, there are no records with id = 0. So introducing a foreign key in this case will not work. I had to modify the table schema and then change all 0s to NULLs. Let’s say that a user might have an optional reference to a city:

ALTER TABLE users MODIFY city_id INT DEFAULT NULL;

UPDATE users SET city_id = NULL WHERE NOT EXISTS
(SELECT * FROM cities WHERE city_id = cities.id);

Here we can see a strategy of setting NULL every time we cannot find a destination entity. We don’t want to remove a user’s record just because it refers to a non-existent city. NULL means a value is unknown, uncertain (it’s not the same as an empty value or 0). If we have an incorrect city_id value, we cannot specify which exact city it refers to – so we set it to NULL.

Other strategy can be taken for many-to-many join tables. In the users_categories table I mentioned, if we have invalid user_id, category_id or even both – this means that the whole record can be removed:

DELETE uc FROM users_categories uc WHERE NOT EXISTS (
SELECT * FROM users u WHERE u.id = uc.user_id
) OR NOT EXISTS (
SELECT * FROM categories c WHERE c.id = uc.category_id
)

When you pick a certain data cleanup strategy, you need to know the business well. Maybe the wrong data is used in some reporting systems and if you suddenly fix the stats, it will confuse people because they relied on these data for a long time. They might perceive your ingenious fix… as a mistake 🙂

Fixing complex cases

Sometimes, relations between entities form a long chain: for example, a user puts and order which consists of several items, and every order item refers to a product, and every product… has been added by some admin user. At first you should check the most generic entities (like products) and then dig deeper into orders and order items. Take a look at the following steps to see which strategy I pick in different scenarios:

  1. Check if there are products added by non-existent users. An information about a removed user is not necessary for the system to work. A wrong user_id can be set to NULL.
  2. Check if there are orders with wrong user_id. An orders history is essential for the legal, tax and reporting purposes. We cannot remove any orders, so we just set user_id = NULL if a user does not exist (maybe he was not logged in?).
  3. Check if there are any order items related to non-existent orders or products. Items contain important invoicing data like net and gross price. But if a report requires a proper reference from orders_items to orders table, and we have queries like SELECT … FROM orders JOIN orders_items …, then a JOIN clause eliminates wrong records anyway. So we can remove items that refer to wrong orders. The same situation applies to product_id. We won’t prepare sale reports for products which do not exist.

Of course before executing such dangerous queries we need to have a backup. And while we analyze the details of the system we’re trying to clean up, we should consult as many people who have a business knowledge as possible.

I know it would be nice to have an automatic script to clean up all the data before adding constraints. But there is no silver bullet because every case is different. You need to use your creativity and intuition! It’s not easy, but in the end, satisfaction will be great.

Some interesting, further, external read:

6 Things That Can Ruin Corporate Trainings

How to gloriously waste three days of a 10-person development team? Send them to a wrong training led by a wrong person. As a team leader I made this mistake twice. There will be no third time.

One training was about microservices and the second one about refactoring and design patterns. We observed some really annoying anti-patterns of a technical training:

  1. Lack of preparation for practical exercises. You could say, “slides are cheap – show me the code”. And the code was bad. Our coaches had problems with IDE and dependencies. Their development environment differed from ours. We all strugged with proper configuration, wasting hours sitting on StackOverflow. It destroyed students’ motivation and coaches’ authority. We could avoid these problems if we earlier agreed to use the same environment and if the coach would check his scripts the day before.

    The exercises were chaotic also because both coaches did not really have an idea what exactly they want to do. First we created a class, then removed it and created another one without a clear purpose. Then we set up a whole different project from scratch. We wrote a lot of “example” code which eventually was moved to trash anyway. It could not serve as any aid or template during our work.

    I think we should have had a clear plan for the training and work on one codebase during all three days, slowly improving it. A clear goal should have been set.

  2. A coach does not allow discussion. This is bad because you can develop stuff in many different ways. A coach should not talk from a “guru” position. Maybe still he could learn something from his (her) students? A coach should not cling on to the agenda if he sees that students have other ideas and needs. He should not treat students’ feedback as an attempt to underestimate his authority.

  3. A coach demands 100% focus all the time, from everyone. I saw a coach who loudly reprimanded a girl who was handling company texts during training. She politely explained that it was an emergency. You can always have an emergency if you leave your company for a couple of days. The second thing is that every human has different hours in which he or she has most attention.

    As a coach, you should recognise symptoms of fatigue or lack of interest. Maybe your students need a coffee break? Maybe they disagree with you, but you speak so loud and fast they are afraid to interrupt you? Ask the group if everyone is okay and if anyone needs a break.

  4. A training company hires a coach who uses certain technology only from time to time. The company promised us they would craft a special training for our custom needs. But we had no idea that they would force a JavaScript coach who had “some previous” experience in PHP to conduct our PHP design patterns training. It turned out that we were the ones who taught our coach, not the other way around.

  5. A coach does not admit mistakes. If students tell you that you are wasting time, trying to configure some tool by copying and pasting from StackOverflow for two hours – maybe they are right? Maybe you should let go? You didn’t prepare, you’re not handling it – it’s okay, we’re humans. Just apologize and admit you made a mistake, try to move on, adapt.

  6. A coach makes an unnecessary barrier between him (her) and the students. He is too formal. Come on, developers are cool guys, don’t call them “mister” and “miss”! Let’s treat a training like a place to share knowledge and have a discussion like colleagues do, not in a “master-student” relation. So as a coach, don’t run away during a break. Have a chat with students, be available, share some funny stories.

    However, don’t try to be too cool, too “homie”. Be yourself, be authentic. You won’t build trust and authority by trying to be someone else. Also, don’t try cheap motivational talks, be specific.

My last advice for anyone who tries to train people: show yourself at conferences. Build your own brand, practise your talks, prepare for the hardest questions from the audience. Build your authority by showing your skills and openness.