Organisations should build their own generative artificial intelligence (GenAI) systems based on retrieval augmented generation (RAG) with open source products such as DeepSeek and Llama.

This is according to Alaa Moussawi, chief data scientist at New York City Council, who recently spoke at the LEAP 2025 tech event in Saudi Arabia.

The event, held near the Saudi capital Riyadh, majored on AI and came as the desert kingdom announced $15bn of planned investment in AI.

But, says Moussawi, there's nothing to stop any organisation testing and deploying AI with very little outlay at all, as he described the council's first successful project way back in 2018.

New York City Council is the legislative branch of the New York City government, mainly responsible for passing laws and the city's budget. The council has 51 elected officials plus attorneys and policy analysts.

What Moussawi's team set out to do was make the legislative process more fact-based and evidence-driven, and make the everyday work of attorneys, policy analysts and elected officials smoother.

First AI app built in 2018

To that end, Moussawi's team built its first AI-like app, a duplicate checker for legislation, for production use at the council in 2018.

Whenever a council member has an idea for legislation, it's put into the database and timestamped so it can be checked for originality and credited to the elected official who made that law come to fruition.

There are tens of thousands of ideas in the system, and a key step in the legislative process is to check whether an idea has been proposed before.

“If it was, then the idea must be credited to that official,” says Moussawi. “It is a very contentious thing. We've had errors happen where it was missed.”

By today's standards, it's a rudimentary model, says Moussawi. It uses Google's word2vec, which was released in 2013 and captures information about the meaning of words based on those around them.

“It's somewhat slow,” says Moussawi. “But the important thing is that while it might take a bit of time – five or 10 seconds to return similarity rankings – it's much faster than a human and it makes their jobs much easier.”

Vector embedding

The key technology behind the duplicate checker is vector embedding.

“That could often consist of over a thousand dimensions,” says Moussawi. “A vector embedding is really just a list of numbers.”

Moussawi demonstrated the idea by simplifying things down to two dimensions. In a game of cards, for example, you can take the vector for “royalty” and the vector for “woman” and they should give you the vector for “queen” when you add them together.

“Strong vector embeddings can derive these relationships from the data,” says Moussawi. “Similarly, if you added the vectors for ‘royalty’ and ‘man’, you can expect to get the vector for ‘king’.”
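For readers who want to try the analogy themselves, a minimal sketch in Python follows. It assumes the gensim library and its downloadable pretrained Google News word2vec vectors, which are not part of the council's system; the classic published form of this arithmetic is “king” minus “man” plus “woman” landing near “queen”.

```python
# A minimal sketch of word-vector arithmetic, assuming gensim and its
# downloadable pretrained word2vec vectors (not the council's own model).
import gensim.downloader as api

wv = api.load("word2vec-google-news-300")  # large one-off download of pretrained vectors

# The classic published analogy: king - man + woman lands near "queen"
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# Moussawi's simplified framing: adding "royalty" and "woman" should also
# point towards "queen", if the embedding space has captured the relationship
print(wv.most_similar(positive=["royalty", "woman"], topn=3))
```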

That's essentially the technology in the council's duplicate checker. It trains itself by using the full set of texts to generate its vector embeddings.

“Then it sums over all the word embeddings to create an idea vector,” he says. “We can measure the distance between this idea for a law and another idea for a law. You could measure it with a ruler if you were working in two-dimensional space, or you apply the Pythagorean theorem extended to a higher-dimensional space, which is fairly straightforward. And that's all there is to it – the measure of distance between two ideas.”
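As a rough illustration of that summing-and-distance step, and not the council's actual code, the sketch below uses a tiny made-up embedding table; in the real system the vectors would come from the trained word2vec model.

```python
# Sketch of the duplicate-checker idea: sum word vectors into an "idea vector",
# then measure Euclidean distance between ideas. The embedding table here is
# invented purely for illustration; real vectors would come from word2vec.
import numpy as np

embeddings = {          # hypothetical 3-dimensional word vectors
    "ban":     np.array([0.9, 0.1, 0.0]),
    "plastic": np.array([0.2, 0.8, 0.1]),
    "bags":    np.array([0.1, 0.7, 0.3]),
    "tax":     np.array([0.8, 0.2, 0.1]),
}

def idea_vector(text: str) -> np.ndarray:
    """Sum the word embeddings of every known word in an idea's text."""
    return sum(embeddings[w] for w in text.lower().split() if w in embeddings)

a = idea_vector("ban plastic bags")
b = idea_vector("tax plastic bags")

# Euclidean distance: the Pythagorean theorem extended to n dimensions
distance = np.linalg.norm(a - b)
print(f"distance between ideas: {distance:.3f}")  # smaller means more similar
```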

Moussawi is a strong advocate that organisations should get their hands dirty with generative AI (GenAI). He has a PhD in software engineering and is a close student of developments – through the various iterations of neural networks – but is keen to stress their limitations.

“AI text models, including the state-of-the-art models, are simply predicting the next word in a sequence. So, for example, if you ask a large language model [LLM] ‘Why did the chicken cross the road?’, it's going to pump it into the model and predict the next word, ‘the’, and the next one, ‘chicken’, and so on.

“That's really all it's doing, and this should somewhat make you understand why LLMs are actually not intelligent and don't truly think the way we do.

“By contrast, I'm explaining a concept to you and I'm trying to relay that idea and I'm finding the words to express that idea. A large language model has no idea what word is going to come next in the sequence. It's not thinking about a concept.”
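To make the next-word-prediction point concrete, here is a minimal sketch using the openly available GPT-2 model via Hugging Face's transformers library; the model and prompt are illustrative and not anything Moussawi's team uses.

```python
# Sketch of greedy next-token prediction: at each step the model only scores
# "what token comes next". GPT-2 is used purely because it is small and free.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Why did the chicken cross the road?", return_tensors="pt").input_ids

for _ in range(10):
    logits = model(input_ids).logits        # a score for every token in the vocabulary
    next_id = logits[0, -1].argmax()        # greedily take the single most likely next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```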

According to Moussawi, the big breakthrough in the scientific community that came in 2020 was that compute, datasets and parameters could scale and scale, and you could keep throwing more compute at the problem and keep getting better performance.

He stresses that organisations should bear in mind that the science behind the algorithms isn't secret knowledge: “We have all these open source models like DeepSeek and Llama. But the important takeaway is that the fundamental architecture of the technology did not really change very much, we just made it more efficient. These LLMs didn't learn to magically think all of a sudden, we just made it more efficient.”

Why you should DIY AI

Coming up to date, Moussawi says New York City Council has banned the use of third-party LLMs in the workplace because of security concerns. This means the organisation has opted for open source models that avoid the security concerns that come with cloud-based subscriptions or third-party APIs.

“With the release of the first Llama models, we started tinkering on our local cluster, and you should too. There are C++ implementations that can be run on your laptop. You can do some surprisingly good inference, and it's great for developing a proof of concept, which is what we did at the council.

“The first thing to do is to index documents into some vector database. This is all work you just do on the back end to set up your system, so that it's ready to be queried based on the vector database that you've built.

“Next, you need to set up a pipeline to retrieve the documents relevant to a given query. The idea is that you ask it a prompt and you'd run that vector against your vector database to pull back the most relevant documents, however many you need depending on your domain.

“This process is known as retrieval augmented generation, or RAG, and it's a great way to provide your model with source material that its output should be limited to. This significantly reduces hallucinations – and since it's pulling the documents that it's responding with from the vector database, it can cite sources.”
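Local inference of the kind Moussawi describes, C++ implementations running on a laptop, can be tried with very little code. The sketch below uses the llama-cpp-python bindings around llama.cpp; the model file path is hypothetical, and any GGUF-format open source model you have downloaded will do.

```python
# Minimal local inference sketch using llama-cpp-python (Python bindings for
# llama.cpp). The model path is hypothetical: point it at any GGUF model you
# have downloaded, e.g. a quantised Llama variant.
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf", n_ctx=4096)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise what a vector database is in two sentences."}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```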

These, says Moussawi, provide guardrails for your model and give the end user a way to ensure the output is legitimate because sources are being cited.
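The back-end indexing and retrieval steps Moussawi outlines can be sketched in a few lines. The example below is a toy under stated assumptions rather than the council's pipeline: it uses the sentence-transformers library for embeddings, an in-memory array as the “vector database”, and a handful of made-up documents, and it only builds the source-cited prompt that would then be handed to a local LLM.

```python
# Toy RAG sketch: embed documents once (the back-end indexing step), then at
# query time retrieve the closest ones and paste them, with citation labels,
# into the prompt for a local LLM. Documents and model name are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Local law to require permits for sidewalk cafes.",
    "Resolution calling for annual school bus safety inspections.",
    "Local law to expand composting collection in city parks.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = encoder.encode(documents, normalize_embeddings=True)  # the "vector database"

def retrieve(query: str, k: int = 2):
    """Return the k documents closest to the query in embedding space."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q                      # cosine similarity (vectors are unit length)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

question = "Do sidewalk cafes need a permit?"
sources = retrieve(question)
prompt = (
    "Answer using only the sources below and cite them by number.\n"
    + "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    + f"\n\nQuestion: {question}"
)
print(prompt)   # this retrieval-grounded prompt is what the local model would receive
```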

And that's exactly what Moussawi's team did, and his message – while he awaits delivery of the council data science team's first GPUs – is: “What are you waiting for?”
