In this podcast, we talk to Quantum's enterprise products and solutions manager, Tim Sherbak, about the impacts of artificial intelligence (AI) on data storage, and in particular about the difficulties of storing data over long periods and in very large volumes.

We talk about the technical requirements AI places on storage, which can include the need for all-flash in a highly scalable architecture and the need to aggregate throughput over multiple and single streams.

We also talk about the reality of “forever growth” and the need for “forever retention”, and how organisations might optimise storage to cope with such demands.

In particular, Sherbak mentions the use of FAIR principles – findability, accessibility, interoperability and reusability – as a way of handling data in an open way that has been pioneered in the scientific community.

Finally, we talk about how storage suppliers can leverage AI to help manage those vast quantities of data across vast and diverse data stores.



What impacts does AI processing bring to data storage?

AI processing places huge demands on the underlying data storage you have. Neural networks are hugely computationally intensive, and they take in a large amount of data.

The basic challenge is feeding the GPUs. We've got massively powerful and expensive compute clusters [of graphics processing units], so the basic challenge is how we feed data to them at a rate that keeps them running at full capacity all the time, given the enormous amount of computational analysis that's required. It's all about high throughput and low latency.

First off, that means we need NVMe [non-volatile memory express] and all-flash solutions. Second, these solutions tend to have a scale-out architecture, because you need seamless access to all the data in a flat namespace.

In the current timeframe, there's a lot of focus on RDMA capability – remote direct memory access – such that all the servers and storage nodes in the cluster have direct access and visibility into the storage resources. This, too, can optimise storage access across the cluster. Then lastly, it's not just aggregate throughput that's desirable; single-stream performance is also very important.

And so there are new architectures with parallel data path clients that allow you to not only aggregate multiple streams, but also optimise each of those individual streams by leveraging multiple data paths to get the data to the GPUs.
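As a rough illustration of that idea, the sketch below splits a single read into byte ranges fetched concurrently, so one logical stream can aggregate the bandwidth of several data paths. The file path, stream count and chunking are assumptions made for illustration; a real parallel file system client does this transparently and at much higher performance.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def read_range(path, offset, length):
    # Read one byte range; each call could travel over a separate data path
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def parallel_read(path, num_streams=4):
    # Split the file into ranges, fetch them concurrently, then reassemble
    size = os.path.getsize(path)
    chunk = max(1, (size + num_streams - 1) // num_streams)
    ranges = [(off, min(chunk, size - off)) for off in range(0, size, chunk)]
    with ThreadPoolExecutor(max_workers=num_streams) as pool:
        parts = pool.map(lambda r: read_range(path, *r), ranges)
    return b"".join(parts)

# data = parallel_read("/mnt/dataset/shard-000.bin")  # hypothetical path
```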

How can organisations manage storage more effectively, given the likely impacts of AI on data, data retention, etc?

With AI these days, there are two really clear problems.

One is that we've got forever data growth, and the other is that we've got forever retention of the data we're architecting into these solutions. And so there are enormous amounts of data above and beyond what's being calculated in the context of any individual run in a GPU cluster.

That data needs to be preserved over the long term at a reasonable cost.

There are solutions on the market that are effectively a mix of flash, disk and tape, so that you can optimise the cost of the solution as well as its performance by spreading data across the three mediums. By doing that, you can right-size the performance and the cost-effectiveness of the solution you're using to store all this data over the long term.
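As a minimal sketch of how such right-sizing might look – assuming a simple rule based only on how recently data was accessed, with hypothetical thresholds – the following maps a dataset to flash, disk or tape:

```python
from datetime import datetime, timedelta

def choose_tier(last_accessed, now=None):
    # Map a dataset's last access time to a storage tier
    now = now or datetime.utcnow()
    age = now - last_accessed
    if age < timedelta(days=30):
        return "flash"  # hot data stays on NVMe/all-flash near the GPU cluster
    if age < timedelta(days=365):
        return "disk"   # warm data moves to lower-cost disk
    return "tape"       # cold data is preserved long term on tape

# choose_tier(datetime(2020, 1, 1))  # -> "tape"
```

Commercial tiering engines use far richer policies than access age alone, but the principle of matching data to the cheapest medium that still meets its performance needs is the same.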

The other thing I recommend to organisations looking at how to solve this problem of forever and forever-growing data is to look into the concept of FAIR data management. This concept has been around for six or eight years. It comes from the research side of the house, in organisations that are looking at how to curate their research, but it also has real impact and capability to help people as they look at their datasets as well.

FAIR is an acronym for findable, accessible, interoperable and reusable. This is really a set of principles [that allow] you [to] measure your data management environment, to make sure that as you evolve the data management infrastructure, you're measuring it against these principles [and] doing the best job you can at curating all this data. It's kind of like taking a little bit from library science and applying it in the digital age.
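To make the principles concrete, here is the kind of metadata record FAIR encourages; the fields, identifiers and URLs are hypothetical examples rather than any standard schema:

```python
fair_record = {
    # Findable: a persistent identifier and rich, searchable description
    "identifier": "doi:10.1234/example-dataset",  # hypothetical identifier
    "title": "Player tracking data, 2023 season",
    "keywords": ["sports", "tracking", "telemetry"],
    # Accessible: where and how the data can be retrieved
    "access_url": "https://example.org/datasets/player-tracking-2023",
    "access_protocol": "https",
    "licence": "CC-BY-4.0",
    # Interoperable: open, well-defined formats and vocabularies
    "format": "parquet",
    "schema": "https://example.org/schemas/tracking-v1",
    # Reusable: provenance and contact details so others can trust and reuse it
    "provenance": "Exported from stadium sensor feed, cleaned 2024-02-01",
    "contact": "data-steward@example.org",
}
```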

How can AI help with data storage for AI?

That's a really interesting question.

I think there are some basic scenarios where, as storage vendors collect data from their customers, they can optimise the operations and the supportability of the infrastructure. With that experience and the patterns of usage, etc, we can use advanced algorithms to more effectively support customers.

But I think probably the most powerful application of AI in data storage is this concept of self-aware storage or, likely more appropriately, self-aware data management. The idea is that we can catalogue rich metadata – data about the data we're storing – and we can use AI to do that cataloguing and pattern mapping.

As we grow these larger and larger datasets, AI will be able to auto-classify and self-document the datasets in a variety of different ways. That will benefit organisations by enabling them to more quickly leverage the datasets at their disposal.

Just think in terms of an example like sports, and how AI might be used to easily document a team's or a player's career just by reviewing all the player's film, articles and other information that AI can have access to. When a great player retires or passes on today, that takes a lot of documentary work; with AI, we have more opportunity to gain quicker access to that data.
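As a rough sketch of that kind of auto-classification – assuming the Hugging Face transformers library is available, and with hypothetical labels, paths and thresholds chosen purely for illustration – a zero-shot classifier can tag archived articles so they can later be pulled together into a career retrospective:

```python
from pathlib import Path
from transformers import pipeline  # assumes the transformers library is installed

# Zero-shot classification lets us tag text against labels chosen at run time
classifier = pipeline("zero-shot-classification")
LABELS = ["match report", "player interview", "season statistics", "career retrospective"]

def auto_tag(path):
    # Classify a sample of the document and keep the labels the model favours
    text = Path(path).read_text(errors="ignore")[:2000]
    result = classifier(text, candidate_labels=LABELS)
    return [label for label, score in zip(result["labels"], result["scores"]) if score > 0.5]

# catalogue = {str(p): auto_tag(p) for p in Path("/archive/articles").glob("*.txt")}
```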
