The Importance of Active Learning in Data Science and Engineering


By Andrew Bolster, machine learning engineer at WhiteHat Security

Back when I was pursuing my undergraduate degree in electronics and software engineering, I couldn’t imagine a path that would lead to me working with NATO on port protection and maritime defense, teaching smart submarines how to trust each other. But while I was working toward a Ph.D., that’s what happened. Then, instead of following the path into academia, I was enticed by a friend to work with him on biometrics. From there, I found an opportunity to apply my skills and knowledge to the cybersecurity industry – but that’s not something I could have predicted either.


The new year has me reflecting on the roundabout route that led to my current role as a machine learning (ML) team lead with WhiteHat Security. I think it’s important to share some of the challenges, experiences, and opportunities I’ve been fortunate enough to have with others who may be just at the beginning of a journey or professional career in data science and engineering. Like many, I have had my views on certain issues and advancements in technology shaped by my experiences. For example, during the past eight years as chairman of a charitable hackerspace in Northern Ireland, I have become a huge proponent of open data and transparent government. I also believe that with enough work, many of the world’s current challenges, from climate change, road traffic accidents, infectious diseases, and obesity to corrupt governments, can be faced with a combination of accurate, honest data and the advances in machine learning and data science techniques.

If you’re planning to pursue a career in STEM, one important thing to consider is that feeding your interest in these fields is one of the smartest investments you can make. Look for every opportunity to engage and immerse yourself in conversations and in self-directed projects or research. If you’re still a full-time student, don’t panic about exams or even grades. Your education could be out of date by the time you finish your degree pathway, so your long-term success will be defined more accurately by how much you keep learning, and by whether you remain an active learner in your own experiences.

Another critical factor in building success in science and engineering is to find mentors early on. Having sat on both sides of the interview table, I can confidently say that the most active learners with a keen interest in their fields have often worked very closely with an experienced mentor along the way, and subsequently give their time and experience to the next generation of learners who come after them. The role a mentor plays in keeping you engaged and curious cannot be overstated. Not to mention, a mentor can show you all the best ways to ‘win friends and influence people’ throughout an organization.

As an example of active learning in my own career, my team was recently working on verifying vulnerabilities. Using scanning technologies alone to assess websites has been difficult because the scanner can be quite ‘dumb.’ Our team is building a secondary system that applies ML to develop an augmented risk assessment system on top of the battle-tested scanner, so that it gains experience over time. Whenever the ML flags vulnerabilities on a website, they are clustered and grouped together, then verified by human experts. These experts perform a deep dive into websites to investigate further, but it takes a lot of time and resources to do so. To serve our customers, we must be able to scale this process. All vulnerabilities are human verified, drawing on the collective experience our experts possess. When building with ML, the models improve as they receive more engagement from the subject matter experts – and this is where active learning and natural curiosity play a very strong role.
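For readers who have not met the term, the "active learning" idea can be sketched in a few lines of Python. This is only an illustration of one common strategy, uncertainty sampling, and not a description of WhiteHat's system: the `Finding` class, the `score` field (an assumed model probability that a finding is a real vulnerability), and the `select_for_review` helper are all hypothetical names invented for this example. The idea is that expert attention goes first to the findings the model is least sure about, since those verdicts are likely to teach the model the most.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """A hypothetical scanner finding with a model-assigned score."""
    finding_id: str
    score: float  # assumed: model's probability (0..1) that the finding is real


def uncertainty(score: float) -> float:
    """Return 1.0 when the model is maximally unsure (score 0.5), 0.0 at 0 or 1."""
    return 1.0 - abs(score - 0.5) * 2.0


def select_for_review(findings, budget):
    """Pick the `budget` findings the model is least certain about."""
    ranked = sorted(findings, key=lambda f: uncertainty(f.score), reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    # Illustrative values only; these are not real scan results.
    queue = [
        Finding("a", 0.95),
        Finding("b", 0.52),
        Finding("c", 0.10),
        Finding("d", 0.40),
    ]
    first = select_for_review(queue, budget=2)
    print([f.finding_id for f in first])  # prints ['b', 'd']
```

In a fuller loop, the experts' verdicts on the selected findings would be added to the training data and the model retrained before the next batch is chosen. That retraining step is omitted here because it depends entirely on the model being used.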

One thing that ML is terrible at is pattern recognition on time series data. Despite all of the research into stock market trends, for example, ML is very bad at identifying patterns in noisy, time-varying, multi-value inputs. This is because ML doesn’t understand the reason for changes over time and their pattern correlations. What’s missing is the human ability to infer, correlate, and leverage past experience. Humans can say, “Of course that vulnerability won’t work, because this happened.” ML – whether fortunately or unfortunately – lacks common sense, and the ability to intuitively make these kinds of inferences. But this is an exciting time to pursue a career in this field because this is being researched at this very moment!

As you pursue STEM studies, make time to explore and follow new and emerging technologies. Bear in mind that by graduation, there could be entirely new fields and industries to get excited about. That’s a major upside to science and engineering – it’s constantly evolving!

Become plugged into the industry from a global perspective. Read about what’s happening in the world and draw your inspiration from the latest research or breakthroughs as you explore your own potential. But be sure to strike a balance between work and life, too. There are plenty of social opportunities to have fun and make friends at events like meetups, hackathons, or robotics competitions, and quite often, these connections will form the basis of strong, long-lasting personal and professional relationships. Those relationships will support you as you build a reputation within the field, and may even bring you professional opportunities that many people following classical pathways would not normally have access to.

Finally, remember that studying science and engineering is serious, and it’s rare that things will run smoothly every time. There are no experts, just people who have made (and learned from) more mistakes than you have. Expect the unexpected but keep your wits and a sense of humor about it all. Use the time to learn your own preferences – do you enjoy the precision and methodology of lab work, or are you more of a free spirit, who prefers to be in the field? By exploring these options early, you can figure out what appeals to you most and set your course with purpose.

About Andrew Bolster

A technologist, founder of multiple technology-based groups and organizations, an award-winning researcher, community leader, and an occasional public speaker, Andrew Bolster is currently the machine learning team lead at WhiteHat Security, using advanced data mining and machine learning to analyse potential hacking and malware attacks on cloud infrastructure. He was previously a data scientist working in the fields of affective computing and cybersecurity. Follow him on Twitter, @bolster