I spent hours testing ChatGPT Tasks – and its refusal to follow directions was mildly terrifying

Tasks is a new beta feature for the paid versions of ChatGPT. It lets you schedule a prompt to run at a certain time. In this article, I'll explain the feature. Then I'll take you through the incredibly frustrating process of trying to get ChatGPT to do what you want it to do using Tasks.

I hesitate to anthropomorphize the AI, but in this round of testing, ChatGPT has been singularly uncooperative. Rather than whining about it here, let's first dig into this new feature.

How tasks work in ChatGPT

Tasks are prompts that are triggered at a given point in time. They can occur once or repeat. For example, you can say, "At 10:30 a.m. tomorrow, tell me the current weather," and ChatGPT will process the prompt "tell me the current weather" at 10:30 a.m. tomorrow and display a browser notification (if you have that enabled), send you an email, or both.

To enable tasks, you need a Plus (or better) paid ChatGPT account, and you'll need to select the GPT-4o with scheduled tasks model. It also doesn't hurt to have a good therapist.

Once you're in that model, you can invoke the scheduling of tasks in your prompt with something like the "at" statement or "schedule a task" prefix. It seems like ChatGPT does a fair job of interpreting anything that implies a future time request as a task.

I was able to assign a task in both the Mac app and the browser interface, but I was only able to see and manage existing tasks in the browser interface. Under the profile picture on the right of the screen, you can select Tasks from the drop-down menu.

That brings you to a tasks screen where you can see the tasks you've scheduled and those that have been completed.

Hovering over the time will reveal a pencil and three dots. Pause prevents a task from running but leaves it available to you. Delete removes it.

The pencil gives you an edit screen that lets you revise the task before it next runs.

Here you can rename the task, edit the prompt, and change its scheduling.

As far as I can tell, these features work fairly well for a beta. I had one task that never executed, and another that executed ten hours after it was supposed to, but most of them seem to have run as expected. I was able to change the schedule and change the prompt, so those features worked as well.

Gateway drug to agentic AI

At first glance, adding tasks to ChatGPT seems fairly uninteresting. After all, we've had very complete and capable task managers for years. In fact, since ChatGPT Tasks can only notify you via a browser notification or an email, it's less helpful than, say, a task manager that reminds you to get white spray paint when you pull into the hardware store parking lot.

But while Tasks in ChatGPT does considerably less than full-featured task managers, it can also do more. It can run an AI prompt. That means it can take fairly intelligent action automatically at a specific time or times in the future.

Right now, the action is limited. It can process a prompt, but its only output is an email or browser notification. Still, it gives us an idea of how intelligence could be embedded into a timed action with what might be fairly little effort.

Except, as I mentioned before, ChatGPT has been misbehaving throughout this entire experiment, which means I spent more than a day trying to get the AI to cooperate.

See, here's the thing. To demonstrate this, I didn't want to give ChatGPT a simple reminder to present. I wanted to have it do something only an AI could do, to show how an AI performing a task at a given time would be a considerable value add over a scripting process or just line-item tasks.

I do expect this to get better over time. But for now, wow. After a day of this, I'm cranky!

Trying to get a daily news briefing

We've discussed it before and we'll discuss it again. AIs like to make stuff up. They also follow directions only in the sense that they'll respond to prompts in ways that seem authoritative and confident but are completely or subtly wrong.

I consume a lot of news. Every morning, I scan a ton of websites and news sources to get a feel for what's happening in the world. This is different from digging into press releases to see if there are any announcements I want to pay attention to. What I want first thing is to get a flavor for what's happening out there, what's big, and what might either catch my attention or be something I should be aware of.

When it comes to ChatGPT Tasks, I thought combining the agent service with ChatGPT web search had promise for this purpose. It has promise. It just refuses to do what I want.

I tried to get ChatGPT to give me current news stories and sources. Sometimes, it just made them up. Sometimes, it gave me sources and stories from a year ago. Sometimes, it cited stories that supposedly came from one website but came from completely different sites. Some links that said they were about one topic actually pointed somewhere entirely different.

And I really tried. I tried to get ChatGPT to validate its sources. I tried to get it to double-check its work. I tried to narrow down its choices or provide clearer and more specific instructions. I worked it.

My conclusion is this: ChatGPT is able to search the web. And it is able to find some topics. But if you want today's news and you want it verifiable (in terms of it being an actual story with an actual link), ChatGPT is not ready for prime time.
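One of the failures described above, a story attributed to one website whose link actually points to a different site, is the kind of thing that can be checked outside the AI. What follows is only a minimal sketch: the function name and the sample data are hypothetical, and it compares hostnames only, so it cannot tell whether a story is real, current, or on topic.

```python
from urllib.parse import urlparse


def link_matches_source(url: str, claimed_domain: str) -> bool:
    """Return True if the URL's host is the claimed domain or a subdomain of it."""
    host = (urlparse(url).hostname or "").lower()
    claimed = claimed_domain.strip().lower()
    if claimed.startswith("www."):
        claimed = claimed[4:]
    if not host or not claimed:
        return False
    return host == claimed or host.endswith("." + claimed)


# Hypothetical briefing items: (site the AI says the story came from, link it gave).
stories = [
    ("example.com", "https://www.example.com/news/some-story"),
    ("example.com", "https://blog.example.org/unrelated-post"),
]

for claimed, url in stories:
    status = "ok" if link_matches_source(url, claimed) else "MISMATCH"
    print(f"{status}: {claimed} -> {url}")
```

A check like this only flags mismatched domains. Confirming that a story exists and was published today would still mean opening the link.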

Generating a custom weather briefing

My next attempt was to get a daily weather briefing. Again, I wanted something more than just a quick weather report. I have a weather widget on my desktop and can see the weather details whenever I want.

Instead, I wanted ChatGPT to add some value to the weather. I wanted it to draw a picture representing the weather at the time the prompt was executed.

Before attempting to assign a prompt to a future time, I first worked through and refined the main prompt itself. This is important. Make sure you have a prompt that works before unleashing it on the scheduling agent.

I wanted a nicely formatted briefing, along with that representative picture. After a lot of refinement rounds, here's what I got.

Nice, huh? That's the state capitol building here in Salem, Oregon. Here is the prompt I used to create this customized weather briefing.

Perform the following steps strictly and output results sequentially:

1. Print a line containing the text: ‘Your daily weather brief’ in heading 2 bold letters.
2. Generate a DALL-E image that visually represents today's weather in Salem, Oregon. The image should include elements relevant to the weather (e.g., rain, sunny skies) and a recognizable landmark like the Oregon State Capitol. Immediately display the image.
3. Print a heading: ‘Today's weather’ followed by the weather condition and temperature for Salem, Oregon, today.
4. Print a heading: ‘Sunrise/sunset’ followed by the sunrise and sunset times for Salem, Oregon, today
5. Print a heading: ‘Air quality’ followed by the air quality for Salem, Oregon, today
6. Print a heading: ‘Advisories’ followed by any advisories for Salem, Oregon, today. If there are no advisories, display ‘No advisories today’
7. Print a heading: ‘Commute’ followed by any recommendations for commuting in Salem, Oregon, today, particularly based on weather-related issues.
8. Print a heading: ‘Outdoor activities’ followed by any recommendations for outdoor activities in Salem, Oregon, based on today's weather

Do not proceed to the next step until the previous one is complete. Always retry image generation if it fails.

It took me a good couple of hours to get ChatGPT to do this reliably. Note the first line, where I'm telling it to "perform the steps strictly" and "output results sequentially." The use of "strictly" was actually recommended by ChatGPT when I asked it why it wasn't following the directions.

I ran into a bunch of problems trying to get the picture to generate. Step 2 clearly says to use DALL-E. I found that "visually represents" convinced the AI to use current conditions with the theme to produce a newly created image. I also had it include a landmark, because all the other images it generated were mostly of small towns with big trees, like this one.

It also confused Celsius and Fahrenheit. 36 degrees C would have been almost 97 degrees F. For a cold January day, that's a mistake. And, of course, "droize." Although, I have to say, living in Oregon, the weather here really does feel like "droize." So points to DALL-E for making up a word that really does represent how it feels out there.
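For anyone who wants to check that arithmetic, here is a small sketch of the standard conversion formulas. The function names are just illustrative. The second conversion assumes the 36 in the image was meant as Fahrenheit, which would be far more plausible for a cold January day.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert Celsius to Fahrenheit: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


def fahrenheit_to_celsius(fahrenheit: float) -> float:
    """Convert Fahrenheit to Celsius: C = (F - 32) * 5/9."""
    return (fahrenheit - 32) * 5 / 9


# 36 degrees C, as labeled in the image, expressed in Fahrenheit.
print(round(celsius_to_fahrenheit(36), 1))  # 96.8

# The same number read as Fahrenheit, expressed in Celsius.
print(round(fahrenheit_to_celsius(36), 1))  # 2.2
```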

Finally, I had a hard time getting ChatGPT to consistently generate the picture at all. I found the final instruction of "Do not proceed to the next step until the previous one is complete. Always retry image generation if it fails," seemed to overcome the problem.

So, by this time, I had a prompt that worked reliably in ChatGPT. It was time to unleash it as a scheduled task.

Agentifying the task

To do this, all I did was add "At 9:30am today" to the beginning of the prompt. To make it repeat, just replace "today" with "every day."
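If you keep your tested prompt in a file or script, the prefix can be added mechanically. This is only a sketch of the string handling: the function name is hypothetical, the comma separator is my assumption, and nothing here calls an OpenAI API. ChatGPT itself still has to interpret the resulting text as a scheduled task.

```python
def schedule_prompt(prompt: str, time_of_day: str = "9:30am", repeat: bool = False) -> str:
    """Prefix a prompt with an 'At <time> today' or 'At <time> every day' phrase."""
    when = "every day" if repeat else "today"
    # The ", " separator is an assumption; the article only says the phrase
    # was added to the beginning of the prompt.
    return f"At {time_of_day} {when}, {prompt}"


one_time = schedule_prompt("tell me the current weather")
daily = schedule_prompt("tell me the current weather", repeat=True)

print(one_time)  # At 9:30am today, tell me the current weather
print(daily)     # At 9:30am every day, tell me the current weather
```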

Then, right on time, there was an email in my inbox.

I clicked View message and got the output on the left. Notice that it says 50 degrees, but our local temps didn't get above 40 today. Still, it's a nice picture.

Also notice that the AI decided to add the word "step" with each step number to each section of my previously nice custom output. I did a second run with the exact same prompt and got the version on the right.

I then spent the next three hours trying to convince ChatGPT not to include the steps. Sometimes I got a picture. Sometimes I didn't. Sometimes I got a full forecast, other times I didn't. Once, I just got back the full prompt. Once, I just got back the subject of the email message, but no details.

So, yeah…

Not ready for prime time

To be fair, OpenAI does label this feature as beta. And boy-oh-boy, is it beta. On one hand, the idea of an AI agent being able to do things like draw a representative picture of a certain set of data seems intriguing. On the other hand, an AI agent that refuses to follow directions and goes off on all sorts of tangents seems terrifying.

At least with non-AI algorithms, if our code goes off the rails, it's our fault as programmers. But when it comes to AI-based agents, you really can't subject your agentic operations to complete test suites because the AI will perform differently based on the data it gets, the phase of the moon, and its mood. That's an exaggeration, but probably not by much.

We have most of the pieces to do this. As the AIs get better and better (we can only hope, right?), we should be able to launch little agents that can assemble daily briefings.

But AI agents that control machines, the Internet of Things, security, weapons, and other worrisome real-world operations? I'm not sure I'll be behind that until we can prove we have far more complete control over the AIs than we're seeing here.

Otherwise, a prompt like "control my home environment so I can sleep through the night" could well result in the AIs killing us while we sleep as their way of enthusiastically following our directions.

I really wish tech would stop giving me that squidgy feeling in the back of my neck. What about you? Are you looking forward to trying out ChatGPT Tasks or are you more convinced than ever that we should go live in a yurt in the woods? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.
