Gemma 4 Tool Calling Explained: Build AI Agents with Function Calling (Step-by-Step Guide)


Imagine asking your AI model, “What's the weather in Tokyo right now?” and instead of hallucinating an answer, it calls your actual Python function, fetches live data, and responds correctly. That is how powerful the tool-calling features in Google's Gemma 4 are. It is a very exciting addition to open-weight AI: function calling that is structured, reliable, and built directly into the model!

Coupled with Ollama for local inference, it lets you develop AI agents with no cloud dependency. The best part: these agents have access to real-world APIs and services locally, without any subscription. In this guide, we will cover the concept and implementation architecture, as well as three tasks you can experiment with immediately.

Also read: Running Claude Code for Free with Gemma 4 and Ollama

Conversational language models have knowledge limited to when they were trained. Hence, they can offer only an approximate answer when you ask for current market prices or current weather conditions. This gap is addressed by providing an API wrapper around ordinary functions, with the aim of resolving these kinds of questions through tool calling.

By enabling tool calling, the model can:

  • Recognize when it needs to retrieve external information
  • Identify the correct function based on the provided API
  • Compose correctly formatted function calls (with arguments)

It then waits until the execution of that code returns output, and composes a grounded answer based on the received output.

To clarify: the model never executes the function calls it generates. It only determines which functions to call and how to structure the argument list. Your own code executes the functions requested through the API. In this arrangement, the model is the brain, while the functions being called are the hands.

Before you begin writing code, it is helpful to understand how everything works. Here is the loop that every Gemma 4 tool call follows:

  1. Define functions in Python that perform actual tasks (e.g., retrieve weather data from an external source, query a database, convert money from one currency to another).
  2. Create a JSON schema for each function you have created. The schema should contain the function's name and its parameters (along with their types).
  3. When the user sends a message, you send both the tool schemas and the user's message to the Ollama API.
  4. If a tool is needed, the Ollama API returns a tool_calls block rather than plain text.
  5. You execute the function using the arguments returned by the Ollama API.
  6. You return the result to the Ollama API in a message with "role": "tool".
  7. The Ollama API receives the result and returns the answer to you in natural language.

This two-pass pattern is the foundation of every function-calling AI agent, including the examples shown below.
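The loop above can be sketched end-to-end with a mocked model response, so the message shapes are visible before touching a live server. The `echo_tool` function and the hardcoded reply below are illustrative stand-ins, not part of the article's tasks:

```python
# Minimal sketch of the two-pass tool-calling loop, using a mocked
# first-pass response instead of a live Ollama server.

def echo_tool(text: str) -> str:
    """Stand-in for a real tool (hypothetical example function)."""
    return f"echo: {text}"

TOOLS = {"echo_tool": echo_tool}

# Pass 1: the model answers with a tool_calls block instead of plain text.
mock_response = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "echo_tool", "arguments": {"text": "hi"}}}
        ],
    }
}

messages = [{"role": "user", "content": "Say hi via the tool."}]
msg = mock_response["message"]
messages.append(msg)

# Your code (not the model) executes the requested function locally.
for tc in msg.get("tool_calls", []):
    fn = tc["function"]["name"]
    args = tc["function"]["arguments"]
    result = TOOLS[fn](**args)
    # Pass 2 would send this back with role "tool" for the final answer.
    messages.append({"role": "tool", "content": result, "name": fn})

print(messages[-1]["content"])  # echo: hi
```

The second pass would simply send the extended `messages` list back to the model, as the tasks below do.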

To run these tasks, you need two components: Ollama installed locally on your machine, and the Gemma 4 Edge 2B model downloaded. There are no dependencies beyond the Python standard library, so you don't need to worry about installing pip packages at all.

1. To install Ollama (macOS/Linux):

# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

2. To download the model (roughly 2.5 GB):

# Download the Gemma 4 Edge model – E2B
ollama pull gemma4:e2b

After downloading the model, run `ollama list` to confirm it appears in the list of models. You can now connect to the running API at http://localhost:11434 and send requests against it using the helper function we will create:

import json, urllib.request, urllib.parse

def call_ollama(payload: dict) -> dict:
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

No third-party libraries are needed; therefore, the agent runs independently and with full transparency.
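As a quick sanity check that needs no running server, you can build the same request `call_ollama()` would send and inspect it offline; `urlopen()` is what actually performs the call, so constructing the `Request` is side-effect free:

```python
import json, urllib.request

# Build the payload and request exactly as call_ollama() does.
payload = {
    "model": "gemma4:e2b",  # model tag from the pull step above
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
}
data = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=data,
    headers={"Content-Type": "application/json"},
)

# Nothing has been sent yet, so we can inspect the prepared request.
print(req.full_url)                    # http://localhost:11434/api/chat
print(req.get_method())                # POST (implied by a data body)
print(req.get_header("Content-type"))  # application/json
```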

Also read: How to Run Gemma 4 on Your Phone: A Hands-On Guide

Hands-on Task 01: Live Weather Lookup

Our first tool uses Open-Meteo, a free weather API that requires no key, to pull live data for any location. It first geocodes the city name into latitude/longitude coordinates, then fetches current conditions for them. To use this API, follow these steps:

1. Write your function in Python

def get_current_weather(city: str, unit: str = "celsius") -> str:
    # Geocode the city name into latitude/longitude coordinates
    geo_url = f"https://geocoding-api.open-meteo.com/v1/search?name={urllib.parse.quote(city)}&count=1"
    with urllib.request.urlopen(geo_url) as r:
        geo = json.loads(r.read())
    loc = geo["results"][0]
    lat, lon = loc["latitude"], loc["longitude"]
    # Fetch current conditions for those coordinates
    url = (f"https://api.open-meteo.com/v1/forecast"
           f"?latitude={lat}&longitude={lon}"
           f"&current=temperature_2m,wind_speed_10m"
           f"&temperature_unit={unit}")
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    c = data["current"]
    return f"{city}: {c['temperature_2m']}°, wind {c['wind_speed_10m']} km/h"

2. Define your JSON schema

This gives the model the information it needs, so Gemma 4 knows exactly what the function does and expects when it is called.

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get live temperature and wind speed for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Mumbai"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}
3. Create a query for your tool call (and handle and process the response)

messages = [{"role": "user", "content": "What's the weather in Mumbai right now?"}]
response = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": [weather_tool], "stream": False})
msg = response["message"]

if "tool_calls" in msg:
    tc = msg["tool_calls"][0]
    fn = tc["function"]["name"]
    args = tc["function"]["arguments"]
    result = get_current_weather(**args)  # executed locally
    messages.append(msg)
    messages.append({"role": "tool", "content": result, "name": fn})
    final = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": [weather_tool], "stream": False})
    print(final["message"]["content"])

Output

Hands-on Task 02: Live Currency Converter

A plain LLM fails here by hallucinating currency values and being unable to provide accurate, up-to-date conversions. With the help of ExchangeRate-API, the converter can fetch the latest foreign exchange rates and convert accurately between two currencies.

Once you complete Steps 1-3 below, you will have a fully functioning converter in Gemma 4:

1. Write your Python function

def convert_currency(amount: float, from_curr: str, to_curr: str) -> str:
    url = f"https://open.er-api.com/v6/latest/{from_curr.upper()}"
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    rate = data["rates"].get(to_curr.upper())
    if not rate:
        return f"Currency {to_curr} not found."
    converted = round(amount * rate, 2)
    return f"{amount} {from_curr.upper()} = {converted} {to_curr.upper()} (rate: {rate})"
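The conversion itself is simple arithmetic, so the rounding and formatting logic can be verified offline with a hardcoded rate. The helper below factors out that logic for illustration; the 0.012 INR-to-USD rate is made up, not a live quote:

```python
def format_conversion(amount: float, from_curr: str, to_curr: str, rate: float) -> str:
    """Apply a known rate and format the result like convert_currency() does."""
    converted = round(amount * rate, 2)
    return f"{amount} {from_curr.upper()} = {converted} {to_curr.upper()} (rate: {rate})"

# 0.012 is an illustrative INR->USD rate, not live data.
print(format_conversion(5000, "inr", "usd", 0.012))
# 5000 INR = 60.0 USD (rate: 0.012)
```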

2. Define your JSON schema

currency_tool = {
    "type": "function",
    "function": {
        "name": "convert_currency",
        "description": "Convert an amount between two currencies at live rates.",
        "parameters": {
            "type": "object",
            "properties": {
                "amount":    {"type": "number", "description": "Amount to convert"},
                "from_curr": {"type": "string", "description": "Source currency, e.g. USD"},
                "to_curr":   {"type": "string", "description": "Target currency, e.g. EUR"}
            },
            "required": ["amount", "from_curr", "to_curr"]
        }
    }
}

3. Test your solution with a natural language query

response = call_ollama({
    "model": "gemma4:e2b",
    "messages": [{"role": "user", "content": "How much is 5000 INR in USD today?"}],
    "tools": [currency_tool],
    "stream": False
})

Gemma 4 will parse the natural language query and format a proper function call with amount=5000, from_curr='INR', to_curr='USD'. The resulting call is then processed through the same feedback loop described in Task 01.

Output

Hands-on Task 03: A Multi-Tool Agent

Gemma 4 excels at this task. You can offer the model several tools at once and submit a compound query. The model coordinates all the required calls in a single pass; manual chaining is unnecessary.

1. Add the timezone tool

def get_current_time(city: str) -> str:
    # Note: this simple example assumes an Asia/<City> timezone name
    url = f"https://timeapi.io/api/Time/current/zone?timeZone=Asia/{city}"
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    return f"Current time in {city}: {data['time']}, {data['dayOfWeek']} {data['date']}"

time_tool = {
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Get the current local time in a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name for timezone, e.g. Tokyo"}
            },
            "required": ["city"]
        }
    }
}

2. Build the multi-tool agent loop

TOOL_FUNCTIONS = {
    "get_current_weather": get_current_weather,
    "convert_currency": convert_currency,
    "get_current_time": get_current_time,
}

def run_agent(user_query: str):
    all_tools = [weather_tool, currency_tool, time_tool]
    messages = [{"role": "user", "content": user_query}]

    response = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": all_tools, "stream": False})
    msg = response["message"]
    messages.append(msg)

    if "tool_calls" in msg:
        for tc in msg["tool_calls"]:
            fn     = tc["function"]["name"]
            args   = tc["function"]["arguments"]
            result = TOOL_FUNCTIONS[fn](**args)
            messages.append({"role": "tool", "content": result, "name": fn})

        final = call_ollama({"model": "gemma4:e2b", "messages": messages, "tools": all_tools, "stream": False})
        return final["message"]["content"]
    return msg.get("content", "")
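The dispatch-table pattern at the heart of the loop is easy to exercise with stub tools in place of the live APIs. The stubs and the mocked `tool_calls` block below are illustrative, not the real implementations:

```python
# Exercise the dispatch pattern from run_agent() with stub tools
# and a mocked tool_calls block, so no model or network is needed.
def stub_weather(city: str, unit: str = "celsius") -> str:
    return f"{city}: 28°"

def stub_time(city: str) -> str:
    return f"Time in {city}: 09:00"

TOOLS = {"get_current_weather": stub_weather, "get_current_time": stub_time}

mock_tool_calls = [
    {"function": {"name": "get_current_time", "arguments": {"city": "Tokyo"}}},
    {"function": {"name": "get_current_weather", "arguments": {"city": "Tokyo"}}},
]

tool_messages = []
for tc in mock_tool_calls:
    fn = tc["function"]["name"]
    args = tc["function"]["arguments"]
    result = TOOLS[fn](**args)
    tool_messages.append({"role": "tool", "content": result, "name": fn})

print([m["content"] for m in tool_messages])
# ['Time in Tokyo: 09:00', 'Tokyo: 28°']
```

Each tool result becomes its own `"role": "tool"` message, which is exactly what the final pass of `run_agent()` sends back to the model.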

3. Execute a compound, multi-intent query

print(run_agent(
    "I am flying to Tokyo tomorrow. What is the current time there, "
    "the weather, and how much is 10000 INR in JPY?"
))

Output

Here, we wired up three distinct functions backed by three separate real-time APIs, driven through natural language, using one common pattern. Everything executes locally against the Gemma 4 instance; none of these components relies on any remote or cloud resources.

What Makes Gemma 4 Different for Agentic AI?

Other open-weight models can call tools, but they don't do so reliably, and that is what differentiates them from Gemma 4. The model consistently produces valid JSON arguments, handles optional parameters correctly, and determines when to answer from its own knowledge rather than call a tool. As you keep using it, bear the following in mind:

  • Schema quality is critically important. If your description field is vague, the model will have a hard time determining arguments for your tool. Be specific with units, formats, and examples.
  • The required array is validated by Gemma 4; it respects the required/optional distinction.
  • Once a tool returns a result, that result becomes context through the "role": "tool" messages you send on your final pass. The richer the tool's result, the richer the response will be.
  • A common mistake is to return the tool result as "role": "user" instead of "role": "tool"; the model will not attribute it correctly and will attempt to re-request the call.
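To make the last point concrete, a tool result message must carry `"role": "tool"` plus the tool's name and content. The tiny check below follows the Ollama chat message shapes used throughout this guide:

```python
def is_valid_tool_result(msg: dict) -> bool:
    """Check a message has the shape the model expects for a tool result."""
    return msg.get("role") == "tool" and "name" in msg and "content" in msg

# Correct: attributed to the tool the model just called.
good = {"role": "tool", "name": "get_current_weather", "content": "Mumbai: 31°"}

# Wrong: sent as a user turn, the model treats it as a new question
# and will typically re-request the tool call.
bad = {"role": "user", "content": "Mumbai: 31°"}

print(is_valid_tool_result(good), is_valid_tool_result(bad))  # True False
```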

Also read: Top 10 Gemma 4 Projects That Will Blow Your Mind

Conclusion

You have created a real AI agent that uses Gemma 4's function-calling feature, and it runs entirely locally. The agent uses all the architectural components you would find in production. Potential next steps include:

  • adding a file system tool that allows reading and writing local files on demand;
  • using a SQL database to enable natural language data queries;
  • creating a memory tool that writes session summaries to disk, giving the agent the ability to recall past conversations.
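As a starting point for the first suggestion, a file-reading tool follows the same function-plus-schema recipe used above. The `read_file` name and its schema are hypothetical, sketched for illustration rather than taken from the article:

```python
import pathlib

def read_file(path: str) -> str:
    """Hypothetical tool: return a local file's text, or an error string."""
    p = pathlib.Path(path)
    if not p.is_file():
        return f"File not found: {path}"
    return p.read_text(encoding="utf-8")

# Matching schema, in the same format as the earlier tools.
file_tool = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a local text file and return its contents.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path to the file"}
            },
            "required": ["path"],
        },
    },
}

print(read_file("definitely-missing.txt"))  # File not found: definitely-missing.txt
```

Registering it is just one more entry in TOOL_FUNCTIONS and one more schema in all_tools.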

The open-weight AI agent ecosystem is evolving quickly. Gemma 4's native support for structured function calling offers you substantial autonomous functionality without any reliance on the cloud. Start small, create a working system, and the building blocks for your next projects will be ready for you to chain together.

 
