In this podcast, NACE members Joe Mazzella and Tom Hayden of Engineering Director, Inc. (Evanston, Illinois, USA) join the MP Interview Series to discuss the work they are doing to combat corrosion through the use of machine learning.
They provide a brief overview of their project that involves estimating corrosion growth rates in underground pipelines; details about the computer vision app they are creating; and an update on NACE task group TG 589.
More information on their work is available in the August 2020 issue of Materials Performance (MP). A complete transcript of this episode is available below.
[introductory comments]
Rebecca Bickham: Hi, Joe. Hi, Tom. How are you both doing?
JM: Fine. How are you doing, Rebecca?
RB: I’m doing great, thank you.
TH: Thanks. I’m doing great, too. Thanks.
RB: Thanks for being here. So this project of yours is one that we featured in our August 2020 issue of MP, and I wanted to have you both on the podcast today to give our listeners more insight and detail than what was included in the article alone. Joe, let’s start with you. You are the CEO of Engineering Director, Inc., also referred to as EDI. Would you please tell us a little bit about the company and your background?
JM: Sure. I’ve had a couple of career paths along the way. First and foremost, as it relates to today’s subject matter, is working for major high-performance coating manufacturers. My primary role was managing architectural and engineering teams to get specified. The other significant role was working as a magazine publisher in the construction industry for McGraw Hill Construction. I was publisher of the Chicago Architect magazine for the American Institute of Architects, and also the regional ENR magazine for Midwest construction.
RB: Wonderful. Anything else, Joe?
JM: No, not in terms of my background. Would you like to know a little bit about Engineering Director?
RB: Sure.
JM: Okay. Currently, I’m the CEO of the consulting firm called Engineering Director, EDI. We’re focused on spatial analytics and artificial intelligence, primarily associated with earth, atmospheric, and environmental sciences and their related impact on underground and atmospheric corrosion. So that’s what we’re focused on right now. Tom and I like to refer to Engineering Director as a boutique consulting firm specializing in developing, implementing, and measuring lean business processes and strategy through the use of information technology and, as it’s spelled out in the article, geographical information systems. At the core, we’re primarily data scientists that provide what we call real-world applications, what we refer to as the full data science spectrum. So we develop advanced machine learning algorithms, we build data science web applications, and then we distribute the data science throughout the organizations.
RB: Thanks, Joe. Tom, you're the CFO of EDI, and you’re a lecturer at Northwestern University. Could you tell us about your background and how you became involved in the corrosion industry?
TH: Definitely. My background is actually in consumer tech. I don’t come necessarily from the corrosion business. I was a computer science graduate student, and I left that and I worked for Facebook in their Risk & Payments division. This was back when people used to play Farmville. I don't know if any of your listeners are Farmville players. But if they played Farmville back in 2011, I was keeping your Farmville cows from being stolen. I totally come from the consumer tech business, which has adopted machine learning and a lot of these advanced technologies probably for about a decade now, if not more.
From Facebook, I moved to a company called GrubHub, which if you're in the U.S. is sort of the dominant player in online food delivery. That’s actually what led me to meet Joe, is when I worked at GrubHub, we used a lot of weather and atmospheric data in our business. You can imagine if you order carry-out a lot, like you were going to order carry-out when it’s rainy out, or everyone orders carry-out when it’s snowing outside. So the business was actually very correlated to the weather. We would use a lot of weather data to try to predict how many orders we’re going to get so that we could staff customer service and staff how our business operates. Drivers, stuff like that.
I run an open-source library for processing real-time data from NOAA, real-time government weather data, and that’s what led me to meet Joe, because Joe was looking for real-time and historical weather data for a lot of the work that he was doing in corrosion. That’s how I entered the corrosion space, coming tangentially from consumer tech through the weather through to corrosion. But it’s actually been really fascinating. I really love working on these problems, in particular, because it’s a different twist. You can use weather data to predict corrosion, to predict all sorts of aspects of the industry. It’s just been a fascinating endeavor.
RB: So, Joe, you’ll probably want to take this next one. Could you give our listeners a brief overview of the project that involved estimating corrosion growth rates in underground pipelines?
JM: Sure. The business case for the paper involves how to cost-effectively and efficiently assess the environmental conditions and then the related impact of corrosion on underground pipeline using geographical information systems and spatial data that we spoke about, but with limited excavation. I think that’s the key to this. We are more proactive than reactive in predicting the likelihood of corrosion at locations over a broad geography. In the case of this project, the broad geography was over North America.
Then the business case for the project, the objective, was to proactively target those areas that have the highest likelihood of advanced corrosion based on the rate and degree of corrosion, and because of that, reduce risk of failure while we were trying to maximize both the capacity and related cost of inspection. In more technical terms, we used geostatistical analysis tools to, I guess the best way is, to emulate what is occurring in the landscape that is of interest or that may impact corrosion, such as pH and electrical conductivity in the soil; power lines nearby that contribute to alternating current; AC interference, perhaps with rectifiers; magnetic anomalies, paleomagnetic in the ground and maybe atmospheric; road salts; and other known contributors to corrosion of underground pipeline.
Then from there, use this data to generate inputs into machine learning, which is Tom Hayden’s area of expertise.
RB: Let’s move on to you, Tom. Would you discuss the machine learning aspect of the project?
TH: Absolutely. If you take a step back and think about why companies or why anyone chooses to use machine learning, one of the reasons is for risk mitigation and for optimization. If there’s ways that you can run your business better or highlight things that you might miss normally or identify risks that you normally wouldn't find, machine learning is tremendously valuable. For instance, we used it at Facebook, we used machine learning all the time to try to — I mentioned the Farmville cows, but we had all sorts of machine learning models to try to make sure that there weren’t fraudulent transactions going on, to identify high-risk actors in the network, and so on. This isn’t actually all that different.
What we chose to do is we essentially, from a North American pipeline operator, we had ILI [in-line inspection] observations on about 2.2 million girth weld addresses. We wanted to ask the question, given those ILI observations and the growths that they have actually observed on those ILI observations, Can we build a model to try to predict it? So if there’s a future — so that we can predict. Can we find errors? Can we optimize their process? Can we optimize them doing digs? How can we optimize their process for identifying risk and for running their business? So we built this model.
As Joe mentioned, we brought in all sorts of exogenous variables. We brought in things like heliomagnetic data. AC interference is a big topic in corrosion right now, so we brought in some variables related to AC interference because we have data on the proximity to power lines. We brought in atmospheric data. We brought in soil data. We brought in as much data as we could. We essentially built a model. The way machine learning works nowadays is you essentially give all these variables to the computer and you say, “Figure out which variables matter and construct this model for me.”
We put it all in, it built the model, and then we were able to identify, for any girth weld address or any new girth weld address or any new point along the pipeline, the potential risk factors. We could say, “What is the probability that this corrosion growth rate is going to be above X?” Or we can also ask the model, “What do you think this specific growth rate is at this specific location?” It allows us to take a really interesting macro view where we can zoom out, see the overall level of corrosion and the overall estimates. Then we can start to zoom in and identify points of risk.
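[Illustrative sketch, not from the interview: the two questions Tom describes, "what is the probability that the growth rate is above X?" and "what is the growth rate at this location?", can be shown with a very small stand-in model. The interview does not say which algorithm EDI used, so this example swaps in a plain k-nearest-neighbors estimate. The feature names, the training records, and the function names are all hypothetical and invented for illustration.]

```python
import math

# Hypothetical training records, invented for illustration only:
# (soil_pH, distance_to_power_line_km, observed_growth_mm_per_year)
TRAINING = [
    (5.5, 0.2, 0.30),
    (5.8, 0.5, 0.25),
    (6.0, 1.0, 0.20),
    (6.5, 2.0, 0.12),
    (7.0, 3.0, 0.08),
    (7.2, 5.0, 0.05),
    (7.5, 8.0, 0.04),
    (8.0, 10.0, 0.03),
]


def _nearest(features, k):
    """Return the k training records closest to `features`.

    Distance is plain Euclidean distance on the raw features; a real
    model would normally scale the features first.
    """
    return sorted(TRAINING, key=lambda rec: math.dist(features, rec[:2]))[:k]


def estimate_growth_rate(features, k=3):
    """Point estimate: mean observed growth rate of the k nearest records."""
    neighbours = _nearest(features, k)
    return sum(rec[2] for rec in neighbours) / len(neighbours)


def probability_above(features, threshold, k=3):
    """Share of the k nearest records whose growth rate exceeds `threshold`."""
    neighbours = _nearest(features, k)
    return sum(rec[2] > threshold for rec in neighbours) / len(neighbours)


if __name__ == "__main__":
    new_location = (5.6, 0.3)  # hypothetical soil pH and distance in km
    print(estimate_growth_rate(new_location))
    print(probability_above(new_location, threshold=0.22))
```

[Running both functions over every point along a pipeline gives the "zoomed out" view described above, and sorting by the probability highlights the locations to look at first.]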
RB: In addition to all that, EDI is also creating a computer vision app, which can detect corrosion and tell about the level of corrosion. Who wants to speak to that?
TH: I would love to talk about that. This is a project I have really enjoyed. With a background in machine learning, computer vision is part of machine learning. Computer vision is using images to identify things. You may not know this, but you probably interact with computer vision algorithms all the time.
Any time you interact with facial recognition, so if you're unlocking your phone with your face, there’s some computer vision going on there. Or anytime, if you use Snapchat or TikTok or any of these fun apps that do computer vision stuff. They transform your face or make you look old or make you look young. There’s all these cool apps. They all use computer vision. I wanted to bring some of this technology that is really popular in the consumer space over to corrosion.
What we did is we built an algorithm where we fed in about 10,000 pictures of corrosion. The Chicagoland area is not short of corroded things, so I walked around the Chicagoland area, and I took about 10,000 pictures of corrosion. So things on elevated train tracks, things on buildings, on architectural elements, just all over the place. We built this algorithm, and it’s a computer vision app. What it does is it takes a photo and, using all this data I plugged into it, can identify if corrosion is in the picture. We were able to do that successfully.
We have an app in the App Store called Optical Surface Recognition. You can go download it yourself and try it now and point it at a surface, and it will tell you if that surface is corroded or not. It will give you a confidence score to evaluate how confident it thinks it might be corroded. For instance, if you were to point it at a wooden floor, you would probably get a confidence score of very close to zero. But you could point it at something that might have a little bit of corrosion visible, like there might be some bubbling in the coating or something, and you might see that confidence score go up a little bit.
Eventually, what we want to do is, this is sort of our proof of concept. Like, let’s build this app, it’s pretty cool, it can detect corrosion, it can detect some confidence scores, and let’s start layering more technology and more stuff on there. The ultimate goal is we want to get to a point where we can say, “Well, let’s detect levels of corrosion.” So not just “Is it corroded or not?” but “Let’s also detect how corroded is the surface.” Is it 1% corroded? Is it 5%? Is it 50% corroded?
Then, once you have that, you can start to do really cool stuff and say, “What type of corrosion is it? Is it flaking? Is it chipping? Is the coating damaged? How serious is the pitting?” You can start to do really interesting computer vision apps with nothing more than consumer-grade cameras. That allows you to have some really cool ideas you can think about. You can imagine putting them on piers as boats go by. You can scan the boat for corrosion on the exterior. You can start to do some really cool applications that are passive in nature but might have a big impact. Again, it’s called Optical Surface Recognition. I invite your listeners to go download it in the App Store. I think if you just search the Apple App Store, “Optical Surface Recognition,” it will pop up and you can download it and play around with it. It works on videos, on real-time videos, and it works on images. Definitely check it out.
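[Illustrative sketch, not from the interview: the two outputs Tom describes, a confidence score for "is it corroded?" and a later goal of "how corroded is the surface?", can be pictured as below. This is not the Optical Surface Recognition app's code. The 0.5 threshold, the per-pixel mask, and the function names are assumptions made up for this example.]

```python
def classify(confidence, threshold=0.5):
    """Turn a confidence score in [0, 1] into a yes/no label.

    The 0.5 default is an arbitrary assumption, not a value from the app.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return "corroded" if confidence >= threshold else "not corroded"


def percent_corroded(mask):
    """Percentage of pixels flagged as corroded.

    `mask` is a hypothetical per-pixel result: a list of rows, where each
    entry is True if that pixel was flagged as corroded.
    """
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    flagged = sum(1 for row in mask for pixel in row if pixel)
    return 100.0 * flagged / total


if __name__ == "__main__":
    print(classify(0.02))  # e.g. a wooden floor scoring close to zero
    print(classify(0.81))
    mask = [
        [False, False, True, False],
        [False, False, False, False],
    ]
    print(percent_corroded(mask))
```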
RB: Thanks, Tom. So you're also heading up a NACE Task Group, TG 589. Could you tell us about that as well?
TH: Definitely. I think this is actually very important. Other industries — for instance, radiology — what they have done is they’ve built these really interesting datasets. For instance, in radiology, of mammograms. So they went out and they’ve collected tens of thousands or hundreds of thousands of different pictures of mammograms, and they’ve then gone through and labeled all of them. Like, Is a tumor visible in this image? Is a tumor not visible in this image? Where is the tumor in this image?
What that has done is that’s created a revolution in machine learning technology as it applies to radiology. A bunch of computer scientists have all of a sudden found these datasets and started building really interesting computer vision applications. For that industry, it’s really moved the industry forward quite a bit. I personally have a vision, and I think the industry would really benefit from having a working group like this, like what they’ve done in radiology, to collect images.
One of the things you need for computer vision is a great diversity of images. Images of corrosion on the Chicago L train only gets me so far. We need images of marine-based corrosion. We need images of pipeline-based corrosion. We need above-ground, below-ground corrosion. We need corrosion on different types of architectural elements that may not be in my training set. That’s what you really need. You need to build this really large and diverse set of data.
That’s what I think the value of the working group is: we can interact with people in other industries. Joe and I work primarily in the pipeline business and in the pipeline corrosion space, but there’s all sorts of interesting aspects in coatings and marine corrosion. That’s part of the vision, is to get all these different groups together and collect this dataset with the idea that it will bring in computer science researchers and people that do algorithms and people that do computer science to start building really interesting technologies that might move the industry forward in terms of automation.
RB: Wonderful. This last question is for both of you. What plans do you have for the future? What are you going to be working on in the future? And is there anything either of you would like to add before we conclude today’s podcast?
TH: Definitely. I’ll take this first and I’ll let Joe chime in. We have a paper in the NACE 2021 Proceedings, which is an advancement of what we did in 2019 and this year, which is building more advanced machine learning algorithms. We started to identify new variables to bring in, especially new variables from what we call the plant/soil atmosphere, which are things related to soil and soil conditions and soil wetness and conductivity and resistivity. We have a model going there.
We’re also working on, especially in the Optical Surface Recognition, building more advanced algorithms, and as I mentioned, building in some of these additional components into the algorithms, like flaking and things like that. Joe, did you want to add to that?
JM: I think you did a good job. We’re also working on launching a couple of tools, key tools, geographical information or GIS tools. We’re launching a GIS platform and related tools.
TH: Yes, at the end of the day, for all of this to work, we need really good GIS systems. Almost everything in the corrosion space has some geographical element to it, whether it’s because it’s in a certain climate or whether it’s because it’s underground in a certain location. Before we can do any of the really cool machine learning stuff, we have to do a lot of GIS stuff, which involves identifying things, where they are, where they are in relation to other certain elements, weather conditions, bodies of water, flood plains, things like that.
They all contribute input into this. Building out a very strong core GIS processing system is our biggest priority right now. I think we’re making a lot of progress on it. If you visit our website, engineeringdirector.com, you can read more on it. We’ve got a lot of really cool stuff on there. Yes, that’s definitely the direction we’re going in.
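[Illustrative sketch, not from the interview: one of the GIS steps Tom mentions, working out where something is "in relation to other certain elements", often comes down to a distance calculation. The example below uses the standard haversine great-circle formula to find the distance from a point on a pipeline to the nearest of several features, such as power line towers. The coordinates and function names are hypothetical, and this is not a description of EDI's GIS platform.]

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_feature_km(point, features):
    """Distance in km from `point` to the closest of `features`.

    `point` is a (lat, lon) pair and `features` is a non-empty list of
    (lat, lon) pairs.
    """
    return min(haversine_km(*point, *feature) for feature in features)


if __name__ == "__main__":
    weld = (41.90, -87.65)  # hypothetical girth weld location
    towers = [(41.95, -87.70), (42.10, -87.60)]  # hypothetical towers
    print(nearest_feature_km(weld, towers))
```

[A distance like this can then be fed into a model as one input variable, in the same way the proximity to power lines is described earlier in the conversation.]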
RB: Thank you so much, Joe and Tom, for joining me today. This is where we’ll end things.
[closing statements]