In today’s era of generative AI, there are countless ways to get started with AI. However, for engineers without a background in AI or machine learning, the overwhelming number of buzzwords can make it hard to know where to begin. That said, there’s no denying the productivity gap between those who know how to use AI and those who don’t.
This article aims to give software engineers a fast-track introduction—a practical guide to navigating and thriving in this new landscape.
The mind map below outlines the flow of the article. We’ll start with how to use AI effectively, then move into how to build things with it, touching on key concepts along the way. Let’s dive in.
How to Use
When people talk about using AI, it’s impossible not to bring up how it all started—ChatGPT was the moment generative AI really hit the mainstream. After that, every major company started launching their own chat models.
I’ve listed four models I personally use almost every day. You might wonder why I switch between them. Simple reason: free plans come with limits, so rotating helps me stay productive without paying.
- ChatGPT – This is the one I use the most. Whether I’m writing, editing, brainstorming, or just trying to get a fresh idea out, ChatGPT usually gets the first draft going.
- Claude – When it comes to quick scripts or anything related to the command line, Claude feels the easiest to work with. For instance, if I need a curl command to upload a JSON file with auth, I’ll ask Claude.
- Gemini – I use this mainly for more in-depth research. It gives off a more grounded vibe, which helps when I need something solid to work with.
- Grok – Once I hit Gemini’s limit for the day, Grok usually takes over.
These tools make it really easy to fold AI into everyday tasks. For most situations, this setup covers everything I need. But when I’m working on something more specific—like building a presentation—I’ll bring in different tools.
One I rely on a lot is Gamma.app. It’s changed how I make slides.
I already had a pretty good rhythm from doing talks regularly, so I can usually outline things quickly. But Gamma takes it even further. I just give it a prompt, let it build a rough version, and then tweak the parts I want to improve. Something that used to take half a day now takes me about an hour.
Another one I keep coming back to is Perplexity.
Since generative AI works by predicting the likeliest next word, it can sound confident and still be wrong. That’s where Perplexity helps—it’s the tool I use to cross-check facts or dig up references. Sure, other AI tools have similar features, but I’ve set Perplexity as my browser’s default search engine, so it’s the quickest for me.
I use a few other tools depending on the project. For example, if you work with Confluence Cloud, Rovo Chat is a solid option that fits nicely into that workflow.
Vibe Coding
For software engineers, vibe coding has become a key part of using AI effectively to boost productivity. But getting good at it takes a lot of hands-on practice—and even just picking the right IDE and agent can take serious trial and error.
Beyond that, no matter which agent you go with, you’ll still need to plug it into the right ecosystem to unlock its full potential. Some rely on MCP-style control flows, others on rule templates. Here are a few specific needs I personally care a lot about:
- Task Master: If your instructions to the AI aren’t clear enough, it’s easy for the model to get stuck, wasting tokens without producing anything useful. Apple’s “The Illusion of Thinking” paper highlighted this: as task complexity increases, model performance can collapse entirely. That’s why task decomposition is essential, and Task Master is a solid open-source option for it.
- Memory bank: Since LLMs have limited context windows, they tend to forget past mistakes or important task details. A persistent memory mechanism helps address that gap.
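To make the memory-bank idea concrete, here is a minimal sketch. The file name and helper functions are hypothetical, not from any particular tool; real memory banks are richer, but the principle is the same: persist notes outside the context window, then prepend them to future prompts.

```python
import json
from pathlib import Path

# Hypothetical memory-bank file; real tools use richer, structured formats.
MEMORY_FILE = Path("memory_bank.json")

def remember(category: str, note: str) -> None:
    """Persist a note so future prompts can carry it as context."""
    entries = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []
    entries.append({"category": category, "note": note})
    MEMORY_FILE.write_text(json.dumps(entries, indent=2))

def recall_block() -> str:
    """Render all remembered notes as a block to prepend to the next prompt."""
    if not MEMORY_FILE.exists():
        return ""
    entries = json.loads(MEMORY_FILE.read_text())
    lines = [f"- [{e['category']}] {e['note']}" for e in entries]
    return "Project memory:\n" + "\n".join(lines)

remember("mistake", "The /users endpoint requires a trailing slash.")
print(recall_block())
```

Each new conversation starts by injecting `recall_block()` into the system prompt, so past mistakes survive even when the model's own context resets.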
There are plenty of other tools that can be added, depending on your development habits and how far you’ve gone with vibe coding. It’s really about building the setup that fits your workflow.
Underneath all of this is prompt engineering—everything starts with how you communicate your intent to the model. Getting that part right matters a lot. Fortunately, Google has published a pretty thorough whitepaper that’s worth reading if you’re looking to understand the fundamentals.
If you want to go deeper, there’s a growing body of research too.
For example, this paper summarizes 26 different techniques for improving prompts—worth checking out if you’re serious about refining your workflow.
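Most of those techniques reduce to structuring what you tell the model: a role, the task, explicit constraints, and a few examples. Here is a minimal sketch of assembling such a prompt; the template is my own illustration, not taken from the whitepaper or the paper above.

```python
def build_prompt(role: str, task: str, constraints: list[str],
                 examples: list[tuple[str, str]]) -> str:
    """Assemble a structured prompt: role, task, constraints, few-shot examples."""
    parts = [f"You are {role}.", f"Task: {task}", "Constraints:"]
    parts += [f"- {c}" for c in constraints]
    for inp, out in examples:  # few-shot examples anchor the output format
        parts += [f"Input: {inp}", f"Output: {out}"]
    return "\n".join(parts)

prompt = build_prompt(
    role="a senior code reviewer",
    task="Summarize the risk of the following diff in one sentence.",
    constraints=["Mention security issues first", "Stay under 30 words"],
    examples=[("diff adding eval()", "High risk: eval() enables code injection.")],
)
print(prompt)
```

The point is less the exact template than the habit: every instruction the model needs should be stated, not assumed.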
Vibe coding is half tools, half mindset—and prompt engineering is the bridge between the two.
How to Develop
Once you’ve got the hang of using AI, the natural next step—especially for engineers—is to start experimenting with what AI can actually do for you. Personally, I’ve built a handful of small tools that I now use regularly at work, like a code review agent and a text-to-command-line agent.
These tools each focus on solving very specific problems with AI. So how do you even start building something like that?
Model selection
The first step is understanding what resources are available. By “resources,” I mean which AI services expose endpoints that you can actually call.
If you’re willing to pay, then the big-name providers are all viable. But if you’re trying to keep costs low, what are your options? Turns out—plenty. Here are a few I actively use:
- OpenRouter: This platform gives you access to a wide range of models, including quite a few with free quotas. In fact, even models like Google’s Gemma 3:27B have a free tier here.
- Ollama: If you’re worried about burning through OpenRouter’s quotas, you can always fall back to running models locally. Ollama is plug-and-play. The trade-off: heavy models are tough to run on a typical local machine, while the smaller ones that do fit are often too limited.
- gemini-balance: This one’s kind of clever. Gemini offers a free tier that’s actually free—it doesn’t sneak in charges once you pass a usage limit. The catch is, the quota’s small. But if you can cycle through enough free-tier tokens, you can effectively run things at zero cost. That’s exactly what gemini-balance helps with.
Here’s how I’ve set things up:
- For important tasks, I rely on gemini-balance. The Gemini 2.0 models are just that good—I trust them with higher-stakes stuff.
- For lighter tasks, I go with OpenRouter, especially when I want to use Gemma 3:27B. It’s a strong model, but OpenRouter doesn’t support function calling for Gemma, so I keep it for simpler jobs.
- For embedding, I use Ollama locally. Embedding isn’t very compute-intensive, but I run it at scale, so I’d rather not worry about hitting quotas.
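For a sense of what the embedding step actually does: an embedding maps text to a vector, and retrieval is just nearest-neighbour search, typically by cosine similarity. The toy three-dimensional vectors below stand in for what an embedding model served by Ollama would return (real embeddings have hundreds of dimensions).

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings; in practice these come from an embedding model.
query = [0.9, 0.1, 0.0]
docs = {"deploy guide": [0.8, 0.2, 0.1], "lunch menu": [0.0, 0.1, 0.9]}

# Pick the document whose vector points in the most similar direction.
best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # → deploy guide
```

This is also why embedding at scale is cheap to self-host: the model call produces the vectors once, and the similarity math afterwards is trivial.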
As you can probably tell, picking a model isn’t just about performance—it’s also about constraints, access, and cost. Each model comes with its own trade-offs, so understanding those is key to building something reliable.
AI Application
Once you’re familiar with the tools, the next step is figuring out what kind of applications you want to build.
This is where imagination comes in—but regardless of what you’re building, you’ll eventually run into the concept of RAG.
RAG (Retrieval-Augmented Generation) is one of the most effective ways to unlock real-world utility from LLMs.
Why?
Because a language model’s context window is finite, you can’t hand it huge documents directly. Yet many tasks (like customer support) require deep background context. RAG bridges that gap by retrieving only the relevant material and feeding it to the model alongside the prompt.
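The core RAG loop is small enough to sketch. Retrieval below is naive keyword overlap standing in for real embedding search, and the chunks are made-up support snippets:

```python
def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the query (real RAG uses embeddings)."""
    q_words = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

chunks = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Refunds require the original order number.",
]

# Retrieve, then augment the prompt with only the relevant context.
context = retrieve("how do refunds work", chunks)
prompt = ("Answer using only this context:\n"
          + "\n".join(context)
          + "\nQ: how do refunds work")
print(prompt)  # this augmented prompt is what actually gets sent to the LLM
```

The irrelevant holiday chunk never reaches the model, which is the whole trick: the LLM only sees what retrieval decided it needs.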
I won’t go deep into RAG architecture here. If you’re curious, I wrote a more detailed piece: Evolution of RAG: Baseline RAG, GraphRAG, and KAG.
No matter what kind of app you’re building, sooner or later, you’ll hit the need for fine-tuning. Most base models are trained for general tasks, but real applications require customization—your data, your workflows. That’s where fine-tuning comes in. Of course, fine-tuning isn’t trivial—it assumes some background in ML. I’ll cover that in another post.
And then there’s evaluation and observability—two things you must consider once your AI app is live. You need to know whether the model is doing its job, and why it failed when it didn’t. Tools like LangSmith and Langfuse can help with this, but you’ll need to spend time experimenting to find what works best for your stack.
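At its simplest, observability means recording every model call: what went in, how long it took, and whether it succeeded. Purpose-built tools do this far more thoroughly; the hand-rolled tracer below only illustrates the idea, with a stand-in function instead of a real model call.

```python
import time

traces: list[dict] = []

def traced_call(name: str, fn, *args):
    """Run a call, recording its name, latency, and success for inspection."""
    start = time.perf_counter()
    try:
        result = fn(*args)
        ok = True
    except Exception:
        result, ok = None, False
    traces.append({
        "name": name,
        "ms": (time.perf_counter() - start) * 1000,
        "ok": ok,
    })
    return result

# Stand-in for a real model call.
def fake_model(prompt: str) -> str:
    return prompt.upper()

traced_call("summarize", fake_model, "hello")
print(traces[-1]["ok"])  # → True
```

Even this crude log answers the two questions that matter in production: is the model succeeding, and where is the latency going.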
Conclusion
So yeah—diving into AI development is a big journey.
From picking tools, to writing prompts, to deploying apps, there’s a huge surface area of decisions.
But don’t let that intimidate you. AI is still new, and most of us are learning as we go.
My advice: start with problems from your day-to-day life. That’s the best way to learn. You’ll naturally uncover more tools, more patterns, and more techniques as you build.
Eventually, you’ll develop your own set of “survival skills” for the AI era.
This article includes a pretty massive mind map—it touches on nearly every major area in the Gen AI space.
You don’t need to master everything—not even close. Just start with what you care about, build one small thing, and the rest will follow.
In this new AI era, curiosity and momentum matter way more than credentials.