• A small comparison of L3 cache in games and applications. What is a cache, why is it needed and how does it work? How much cache does a processor actually need?

    What is the dirtiest place on a computer? Do you think it's the Recycle Bin? User folders? The cooling system? You guessed wrong! The dirtiest place is the cache! After all, it has to be cleaned constantly!

    In fact, there are many caches on a computer, and they serve not as waste dumps but as accelerators for hardware and applications. So where did they get the reputation of a “system trash chute”? Let's figure out what a cache is, what kinds exist, how it works, and why it needs to be cleaned from time to time.

    Concept and types of cache memory

    A cache, or cache memory, is a special store of frequently used data that can be accessed tens, hundreds or even thousands of times faster than RAM or other storage media.

    Applications (web browsers, audio and video players, database editors, etc.), operating system components (the thumbnail cache, the DNS cache) and hardware (CPU caches L1-L3, the framebuffer of a graphics chip, storage buffers) all have their own cache memory. It is implemented in different ways – in software and in hardware.

    • A program cache is simply a separate folder or file into which, for example, pictures, menus, scripts, multimedia content and other contents of visited sites are loaded. It is this folder that the browser dives into first when you reopen a web page. Pulling part of the content from local storage speeds up its loading and reduces network traffic.
    • In storage devices (in particular, hard drives), the cache is a separate RAM chip with a capacity of 1-256 MB, located on the electronics board. It receives information read from the magnetic layer and not yet loaded into RAM, as well as data that the operating system requests most often.
    • A modern central processor contains 2-3 main levels of cache memory (also called super-fast memory), implemented as hardware modules on the same chip. The fastest and smallest (32-64 KB) is the Level 1 (L1) cache – it runs at the same frequency as the processor. L2 occupies a middle position in speed and capacity (from 128 KB to 12 MB). L3 is the slowest and largest (up to 40 MB) and is absent on some models. The speed of L3 is low only relative to its faster siblings; it is still hundreds of times faster than the most productive RAM.
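The idea shared by all these caches – keep recently used items in the fastest storage and evict the rest – can be sketched as a tiny least-recently-used (LRU) cache. The class below is an illustrative toy, not any particular browser's or OS's implementation:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny least-recently-used cache: fast lookups for hot items,
    eviction of the coldest item once capacity is reached."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # keys ordered oldest -> newest

    def get(self, key):
        if key not in self.store:
            return None              # cache miss
        self.store.move_to_end(key)  # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # evicts "b", the least recently used
print(cache.get("b"))  # -> None (miss)
print(cache.get("a"))  # -> 1 (hit)
```

Real caches differ in their eviction policies, but "keep what was touched recently" is the common starting point.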

    The processor's cache memory is used to store constantly used data fetched from RAM, along with machine-code instructions. The larger it is, the faster the processor.

    Today, three levels of caching are no longer the limit. With the Sandy Bridge architecture, Intel added an extra L0 cache to its products, intended for storing decoded micro-operations. And the highest-performance CPUs also have a fourth-level cache, implemented as a separate chip.
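The levels cooperate by falling through: an address is looked up in L1 first, then L2, then L3, and only then fetched from RAM. A minimal sketch of that lookup chain, with made-up latencies and contents purely for illustration:

```python
# Toy model of an L1 -> L2 -> L3 -> RAM lookup chain. The latencies
# (in CPU cycles) are illustrative, not measurements of any real chip.
LEVELS = [
    ("L1", 4, {10, 11}),            # (name, latency, addresses held)
    ("L2", 12, {10, 11, 20, 21}),
    ("L3", 36, {10, 11, 20, 21, 30, 31}),
]
RAM_LATENCY = 200

def access(address):
    """Return (where the data was found, total cycles spent)."""
    cycles = 0
    for name, latency, contents in LEVELS:
        cycles += latency           # each level checked costs its latency
        if address in contents:
            return name, cycles
    return "RAM", cycles + RAM_LATENCY

print(access(10))   # -> ('L1', 4): found immediately
print(access(30))   # -> ('L3', 52): misses L1 and L2 first
print(access(99))   # -> ('RAM', 252): misses everything
```

The numbers make the point of the hierarchy visible: a hit in a small fast level is dozens of times cheaper than a trip to RAM.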

    Schematically, cache levels L0-L3 interact as such a chain, each level falling back to the next (the source article illustrated this with a diagram of an Intel Xeon).

    In human language about how it all works

    To understand how cache memory functions, imagine a person working at a desk. The folders and documents he uses constantly lie on the desk (in the cache memory). To access them, he just has to reach out a hand.

    Papers he needs less often are stored nearby on shelves (in RAM). To get them, he has to stand up and walk a few meters. And whatever he is not currently working on is sent to the archive (written to the hard drive).

    The wider the desk, the more documents fit on it, which means the employee can quickly access more information (the larger the cache capacity, the faster a program or device works, in theory).

    Sometimes he makes mistakes – he keeps papers with incorrect information on his desk and uses them in his work, so the quality of his work drops (cache errors lead to program and hardware failures). To correct the situation, he must throw out the erroneous documents and put correct ones in their place (clear the cache).

    The desk has a limited area (cache memory has a limited capacity). Sometimes it can be expanded, for example by pushing up a second desk, and sometimes it cannot (the cache size can be increased if the program provides for it; a hardware cache cannot be changed, since it is fixed in silicon).

    Another way to speed up access to more documents than the desk can hold is to have an assistant hand the worker papers from the shelf (the operating system can allocate some unused RAM to cache device data). But this is still slower than taking them from the desk.

    The documents at hand should be relevant to the current tasks, and the employee himself must see to that by tidying up regularly (keeping irrelevant data out of the cache falls on the shoulders of the applications that use it; some programs have an automatic cache-clearing function).
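The "automatic cache clearing" mentioned above is often implemented with per-entry expiry times (a time-to-live, or TTL). A minimal sketch under that assumption – the class name and the 0.05-second TTL here are illustrative, not any real application's values:

```python
import time

# Minimal sketch of automatic cache cleanup via per-entry expiry
# times (the "time to live" approach many applications use).
class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}            # key -> (value, expiry timestamp)

    def put(self, key, value):
        self.store[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self.store[key]    # stale: drop it instead of serving it
            return None
        return value

cache = TTLCache(ttl_seconds=0.05)
cache.put("page", "<html>...</html>")
print(cache.get("page"))           # fresh -> served from the cache
time.sleep(0.06)
print(cache.get("page"))           # expired -> None, must be re-fetched
```

The point is that stale entries are never served: they are evicted the moment they are found to be past their expiry.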

    If an employee forgets to keep his workplace in order and his documentation up to date, he can draw up a desk-cleaning schedule and use it as a reminder – or, as a last resort, entrust the job to an assistant (if an application that depends on its cache has become slower or keeps loading stale data, run a cache-cleaning tool on a schedule or do it manually every few days).

    We actually encounter “caching” everywhere in daily life: buying groceries for future use, doing various errands in passing, and so on. Essentially, it is everything that spares us unnecessary fuss and movement, streamlines life and makes work easier. The computer does the same. In short, if there were no caches, it would run hundreds or thousands of times slower. And we probably wouldn't like that.

    One of the important factors that increase processor performance is the presence of cache memory – specifically its capacity, access speed and distribution among the levels.

    For quite some time now, almost all processors have been equipped with this type of memory, which once again proves its usefulness. In this article, we will talk about the structure, levels and practical purpose of cache memory, as a very important characteristic of the processor.

    What is cache memory and its structure

    Cache memory is ultra-fast memory used by the processor to temporarily store data that is most frequently accessed. This is how we can briefly describe this type of memory.

    Cache memory is built on flip-flops, which in turn consist of transistors. A group of transistors takes up much more space than the capacitors that make up RAM. This entails many production difficulties as well as limits on capacity, which is why cache memory is very expensive while having a tiny volume. But this same structure gives such memory its main advantage – speed. Since flip-flops need no refresh, and the propagation delay of the gates they are assembled from is small, switching a flip-flop from one state to another happens very quickly. This allows cache memory to operate at the same frequencies as modern processors.

    Placement is also an important factor: the cache sits on the processor die itself, which significantly reduces access time. Previously, cache memory of some levels was located outside the processor chip, on a special SRAM chip somewhere on the motherboard. Now almost all processors have the cache memory located on the processor die.


    What is processor cache used for?

    As mentioned above, the main purpose of cache memory is to store data the processor uses frequently. The cache is a buffer into which data is loaded, and despite its small size (about 4-16 MB) in modern processors, it gives a significant performance boost in any application.

    To better understand the need for cache memory, let's imagine the organization of a computer's memory as an office. RAM will be a cabinet with folders that an accountant periodically visits to retrieve large blocks of data (that is, folders). The desk will be the cache memory.

    There are elements that are placed on the accountant’s desk, which he refers to several times over the course of an hour. For example, these could be phone numbers, some examples of documents. These types of information are located right on the table, which, in turn, increases the speed of access to them.

    In the same way, data can be added from those large data blocks (folders) to the table for quick use, for example, a document. When this document is no longer needed, it is placed back in the cabinet (into RAM), thereby clearing the table (cache memory) and freeing this table for new documents that will be used in the next period of time.

    Cache memory works the same way: if some data is likely to be accessed again, it is loaded from RAM into the cache. Very often this happens by also loading the data most likely to be needed right after the current data – that is, assumptions are made about what will be used “next”. Such, in simplified form, are its operating principles.
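Those assumptions about what will be used "next" can be imitated with a next-block prefetcher: whenever block i is fetched, block i+1 is fetched too. The toy simulation below (FIFO eviction, invented parameters) shows how much that guess helps on a purely sequential access stream:

```python
def hit_rate(accesses, cache_size, prefetch_next=False):
    """Simple FIFO cache simulation; optionally prefetch block i+1
    whenever block i is fetched (a next-line prefetcher sketch)."""
    cache, order, hits = set(), [], 0

    def load(block):
        if block in cache:
            return
        if len(cache) >= cache_size:
            cache.discard(order.pop(0))   # evict the oldest block
        cache.add(block)
        order.append(block)

    for block in accesses:
        if block in cache:
            hits += 1
        else:
            load(block)
            if prefetch_next:
                load(block + 1)           # guess what comes "after"
    return hits / len(accesses)

stream = list(range(100))                 # purely sequential reads
print(hit_rate(stream, cache_size=8))                      # -> 0.0
print(hit_rate(stream, cache_size=8, prefetch_next=True))  # -> 0.5
```

Without prefetching every sequential access is a miss; with the simple "load the next block too" guess, every other access becomes a hit.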

    Processor cache levels

    Modern processors are equipped with a cache, which often consists of 2 or 3 levels. Of course, there are exceptions, but this is often the case.

    In general, there can be the following levels: L1 (first level), L2 (second level), L3 (third level). Now a little more detail on each of them:

    First level cache (L1) – the fastest cache level, which works directly with the processor core. Thanks to this tight interaction, it has the shortest access time and operates at frequencies close to the processor's. It serves as a buffer between the processor and the second level cache.

    We will consider the sizes using a high-performance processor, the Intel Core i7-3770K. It has 4 × 32 KB = 128 KB of L1 data cache (32 KB per core; each core also carries its own 32 KB L1 instruction cache).

    Second level cache (L2) – the second level is larger than the first but, as a result, has lower “speed characteristics”. It serves as a buffer between the L1 and L3 levels. In our i7-3770K example, the L2 cache size is 4 × 256 KB = 1 MB.

    Level 3 cache (L3) – the third level is again slower than the previous two, but still much faster than RAM. The L3 cache in the i7-3770K is 8 MB. Whereas the previous two levels are private to each core, this level is shared by the entire processor. The figure is quite solid but not exorbitant: for Extreme-series processors like the i7-3960X it is 15 MB, and for some newer Xeon processors, more than 20 MB.
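The payoff of this hierarchy can be expressed as the average memory access time (AMAT): each level's latency is paid by every access that reaches it, weighted by the miss rates of the levels above. The latencies and hit rates below are illustrative assumptions, not measurements of the i7-3770K:

```python
# Average memory access time for a toy three-level hierarchy.
# Latencies (cycles) and hit rates are illustrative assumptions.
def amat(levels, ram_latency):
    """levels: list of (latency, hit_rate); misses fall through to RAM."""
    total, reach = 0.0, 1.0        # reach = share of accesses that get this far
    for latency, hit_rate in levels:
        total += reach * latency   # every access reaching a level pays its latency
        reach *= (1 - hit_rate)
    return total + reach * ram_latency

hierarchy = [(4, 0.90), (12, 0.80), (36, 0.75)]
print(round(amat(hierarchy, ram_latency=200), 2))  # -> 6.92
```

Even though RAM costs 200 cycles in this model, a small L1 that catches 90% of accesses keeps the average under 7 cycles – which is why every modern CPU carries this pyramid of levels.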

    What is cache used for and how much is needed?

    We are not talking about cash, but about processor cache memory and more. Marketers have turned cache capacity into another commercial fetish, especially for central processors and hard drives (video cards also have a cache, but they haven't gotten to it yet). So there is an XXX processor with a 1 MB L2 cache, and an otherwise identical XYZ processor with a 2 MB cache. Guess which one is better? Ah – don't answer right away!

    Cache memory is a buffer that stores what can and/or needs to be put aside for later. The processor does its work, and situations arise when intermediate data needs to be stored somewhere. Where else but the cache! – after all, it is orders of magnitude faster than RAM, because it sits in the processor die itself and usually runs at the same frequency. Some time later, the processor fishes this data back out and processes it again. Roughly speaking, it is like a potato sorter on a conveyor belt who, every time he comes across something other than a potato (a carrot), throws it into a box. When the box is full, he gets up and carries it to the next room. At that moment the conveyor stands idle. The volume of the box is the cache in this analogy. How much is needed – 1 MB or 12? Clearly, if the box is small, too much time goes into carrying it out, but beyond a certain size further growth yields nothing. If the sorter has a box for 1000 kg of carrots, he won't fill it in a whole shift, and because of it he will NOT become TWICE AS FAST! There is one more subtlety: a large cache can increase access latency, firstly, and the likelihood of errors in it grows, for example during overclocking – secondly. (You can read here about HOW to determine processor stability in this case and verify that the error occurs specifically in the cache by testing L1 and L2.) Thirdly, the cache eats up a decent share of the chip area and the transistor budget of the processor. The same applies to the cache memory of hard drives. If the processor architecture is strong, a cache of 1024 KB or more will be in demand in many applications; if you have a fast HDD, 16 MB or even 32 MB is appropriate.
    But no 64 MB of cache will make a drive faster if it is a cut-down “green” version (Green WD) spinning at 5900 rpm instead of the required 7200, even if the 7200-rpm drive has only 8 MB. Next, Intel and AMD processors use the cache differently (generally speaking, AMD does so more efficiently, and its processors are often comfortable with smaller values). In addition, Intel's cache is shared, while AMD's is private to each core. The fastest L1 cache on AMD processors is 64 KB for data and 64 KB for instructions, twice as much as Intel's. A third-level L3 cache is usually present in top processors like the AMD Phenom II 1055T X6 Socket AM3 2.8GHz or its competitor, the Intel Core i7-980X. Games, above all, love large caches. And many professional applications do NOT care about the cache (see “Computer for rendering, video editing and professional applications”) – more precisely, the most demanding of them are generally indifferent to it. But what you definitely should not do is choose a processor by cache size. The old Pentium 4, in its last incarnations, also had 2 MB of cache at operating frequencies well over 3 GHz – compare its performance with a cheap dual-core Celeron E1*** running at about 2 GHz. The Celeron will leave no stone standing of the old-timer. A more current example: the high-frequency dual-core E8600 costs almost $200 (apparently thanks to its 6 MB cache), while the Athlon II X4-620 at 2.6 GHz has only 2 MB – which does not stop the Athlon from butchering its competitor.
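The sorter analogy can be put into numbers: once the box holds everything a shift produces, making it bigger buys nothing. A back-of-the-envelope model with invented timings (one second per item sorted, one minute per trip to the next room):

```python
def shift_time(carrots, box_capacity, sort_per_item=1.0, trip=60.0):
    """Seconds for one shift: sorting runs continuously, but every full
    box forces a trip during which the conveyor stands still."""
    trips = carrots // box_capacity   # a trip happens only when the box fills
    return carrots * sort_per_item + trips * trip

for capacity in (10, 100, 1000, 10000):
    print(capacity, shift_time(carrots=500, box_capacity=capacity))
# 10    -> 3500.0   (50 trips dominate the shift)
# 100   -> 800.0
# 1000  -> 500.0    (box never fills: zero trips)
# 10000 -> 500.0    (ten times bigger box, zero gain)
```

Going from a 10-carrot box to a 1000-carrot box cuts the shift sevenfold; going from 1000 to 10000 changes nothing, exactly like cache capacity past the point of sufficiency.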

    As you can see from the graphs, no cache can replace additional cores, either in complex programs or in CPU-demanding games. An Athlon with a 2 MB cache (red) easily beats a Core 2 Duo with a 6 MB cache, even at a lower frequency and almost half the cost. Many people also forget that video cards have caches too, because, generally speaking, they also contain processors. A recent example is the GTX460, where the makers manage to cut not only the bus and memory capacity (which the buyer will notice) but also the shader cache, from 512 KB to 384 KB (which the buyer will NOT notice). And that adds its own negative contribution to performance. It is also interesting to see how performance depends on cache size, using processors of the same design: as is known, the E6***, E4*** and E2*** series differ only in cache size (4, 2 and 1 MB respectively). Operating at the same frequency of 2400 MHz, they show the following results.

    As you can see, the results do not differ much. I will say more: had a processor with 6 MB been involved, the result would have grown only a little further, since processors reach saturation. But for models with 512 KB the drop would be noticeable. In other words, 2 MB is enough even for games. To summarize: cache is good when everything else is ALREADY plentiful. It is naive to trade hard-drive speed or processor cores for cache size at the same cost, because even the most capacious sorting box will not replace another sorter. But there are good examples too. For instance, the early 65 nm revision of the Pentium Dual-Core had 1 MB of cache for two cores (the E2160 series and similar), while the later 45 nm revision of the E5200 series has 2 MB, all other things (and most importantly, the PRICE) being equal. Of course, you should choose the latter.
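The saturation described here falls out of a simple model: once the cache is larger than the program's working set, extra capacity stops changing the hit rate. A toy FIFO simulation (invented sizes, random accesses over a 256-block working set) demonstrates the flattening:

```python
import random

# Rough model behind the "saturation" effect: once the cache is larger
# than the program's working set, extra capacity stops helping.
def hit_rate(cache_size, working_set, accesses=20000, seed=1):
    rng = random.Random(seed)     # fixed seed -> reproducible stream
    cache, order, hits = set(), [], 0
    for _ in range(accesses):
        block = rng.randrange(working_set)
        if block in cache:
            hits += 1
        else:
            if len(cache) >= cache_size:
                cache.discard(order.pop(0))   # FIFO eviction
            cache.add(block)
            order.append(block)
    return hits / accesses

for size in (64, 128, 256, 512, 1024):
    print(size, round(hit_rate(size, working_set=256), 3))
# hit rate climbs until cache_size reaches the 256-block working set,
# then flattens: 512 and 1024 give exactly the same numbers as 256
```

This is the E2***/E4***/E6*** picture in miniature: capacity helps while the working set does not fit, and is dead weight afterwards.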


    we-it.net


    compua.com.ua

    What is a cache, why is it needed and how does it work?

    What is the dirtiest place on a computer? Do you think it's the Recycle Bin? User folders? The cooling system? Wrong! The dirtiest place is the cache! After all, you have to clean it constantly!

    In fact, there are a lot of caches on a computer, and they serve not as a waste dump but as accelerators for hardware and applications. So where did they get the reputation of a "system trash chute"? Let's figure out what a cache is, what kinds of caches exist, how they work, and why they need to be cleaned from time to time.

    A cache or cache memory is a special storage of frequently used data, which is accessed tens, hundreds and thousands of times faster than RAM or other storage media.

    Applications (web browsers, audio and video players, database editors, etc.), operating system components (the thumbnail cache, the DNS cache) and hardware (the CPU's L1-L3 caches, the graphics chip's framebuffer, storage-device buffers) all have their own cache memory. It is implemented in different ways, in software and in hardware.

    • A program cache is simply a separate folder or file into which, for example, the pictures, menus, scripts, multimedia and other contents of visited sites are loaded. This folder is the first place the browser looks when you reopen a web page. Pulling some of the content from local storage speeds up page loading and reduces network traffic.
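    To make the "separate folder or file" idea concrete, here is a minimal sketch of a browser-style disk cache in Python. Everything in it (the mini_web_cache directory name, the helper names) is invented for illustration; real browser caches also track expiry headers and validate stale entries.

```python
import hashlib
import os
import tempfile

# Hypothetical cache location; a real browser uses its own profile directory.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "mini_web_cache")

def _path_for(url):
    # One file per URL, named by a hash so any URL maps to a safe filename.
    return os.path.join(CACHE_DIR, hashlib.sha256(url.encode()).hexdigest())

def cache_put(url, content):
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(_path_for(url), "wb") as f:
        f.write(content)

def cache_get(url):
    """Return cached bytes, or None on a cache miss."""
    try:
        with open(_path_for(url), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None

cache_put("https://example.com/logo.png", b"\x89PNG...")
assert cache_get("https://example.com/logo.png") == b"\x89PNG..."
assert cache_get("https://example.com/missing") is None
```

    On a repeat visit the content comes back from the local file instead of the network, which is exactly the speedup described above.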

    • In storage devices (hard drives in particular), the cache is a separate RAM chip with a capacity of 1-256 MB located on the electronics board. It receives information read from the magnetic layer that has not yet been loaded into RAM, as well as the data the operating system requests most often.

    • A modern central processor contains two or three main levels of cache memory (also called super-fast memory), implemented as hardware modules on the same die. The fastest and smallest (32-64 KB) is the Level 1 (L1) cache, which operates at the same frequency as the processor. L2 occupies the middle ground in speed and capacity (from 128 KB to 12 MB). L3 is the slowest and most capacious (up to 40 MB) and is absent on some models. L3 is slow only relative to its faster siblings; it is still several times faster than even the most productive RAM.

    The processor's cache memory is used to store constantly used data pulled from RAM, as well as machine-code instructions. The larger it is, the faster the processor.

    Today, three levels of caching are no longer the limit. With the advent of the Sandy Bridge architecture, Intel added an extra L0 cache (for storing decoded micro-operations) to its products. And the highest-performance CPUs also carry a fourth-level cache, implemented as a separate chip.

    Schematically, cache levels L0-L3 form a hierarchy (Intel Xeon is a typical example): each level down is larger in capacity but slower than the level above it.

    How it all works, in human terms

    To understand how cache memory functions, let's imagine a person working at a desk. The folders and documents that he uses constantly are on the table (in cache memory). To access them, just stretch out your hand.

    Papers that he needs less often are stored nearby on shelves (in RAM). To get them, you need to stand up and walk a few meters. And what a person does not currently work with is archived (recorded on the hard drive).

    The wider the table, the more documents will fit on it, which means that the worker will be able to quickly access a larger amount of information (the larger the cache capacity, the faster the program or device works, in theory).

    Sometimes he makes mistakes - he keeps papers on his desk that contain incorrect information and uses them in his work. As a result, the quality of his work decreases (cache errors lead to malfunctions of programs and hardware). To correct the situation, the employee must throw out documents with errors and put the correct ones in their place (clear the cache memory).

    The table has a limited area (cache memory has a limited capacity). Sometimes it can be expanded, for example, by moving a second table, and sometimes it cannot (the cache size can be increased if such a possibility is provided by the program; the hardware cache cannot be changed, since it is implemented in hardware).

    Another way to speed up access to more documents than the desk can accommodate is to have an assistant serve the worker papers from the shelf (the operating system can allocate some of the unused RAM to cache device data). But it's still slower than taking them from the table.

    The documents at hand should be relevant to the current tasks, and the worker himself must see to that by putting things in order regularly (removing stale data from cache memory falls on the shoulders of the applications that use it; some programs have an automatic cache-clearing function).

    If the worker forgets to keep the workplace tidy and the documentation up to date, he can draw up a desk-cleaning schedule and use it as a reminder, or, as a last resort, entrust the job to an assistant (if an application that depends on cache memory has become slower or keeps loading stale data, run cache-cleaning tools on a schedule or clear the cache manually every few days).
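    The desk analogy maps neatly onto a memoization cache such as Python's functools.lru_cache: maxsize is the area of the desk, eviction of the least recently used entry is moving an old paper back to the shelf, and cache_clear() is the scheduled desk cleaning. The fetch_report function below is a made-up stand-in for any slow trip to the "shelf".

```python
from functools import lru_cache

@lru_cache(maxsize=128)            # the "desk" holds at most 128 results
def fetch_report(report_id):
    # Stand-in for a slow trip to the "shelf": a database, disk or network.
    return f"report-{report_id}"

fetch_report(1)                    # miss: goes to the shelf
fetch_report(1)                    # hit: taken straight from the desk
fetch_report(2)                    # miss
info = fetch_report.cache_info()
print(info.hits, info.misses)      # 1 2

fetch_report.cache_clear()         # the scheduled "desk cleaning"
print(fetch_report.cache_info().currsize)   # 0
```

    When the desk fills up (more than maxsize distinct arguments), lru_cache quietly evicts the least recently used result, just as the worker returns an old folder to the shelf.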

    We actually run into "caching" everywhere in everyday life: buying groceries for future use, doing errands in passing, and so on. Essentially, it is everything that spares us unnecessary fuss and extra trips, streamlines life and makes work easier. The computer does the same: in short, if there were no cache, it would run hundreds or thousands of times slower. And we probably wouldn't like that.

    f1comp.ru

    Cache memory. What is cache memory used for? The impact of cache size and speed on performance.

    A cache (or cache memory, buffer) is used in digital devices as high-speed intermediate storage. Cache memory can be found in computer devices such as hard drives, processors, video cards, network cards, CD drives and many others.

    The operating principle and architecture of the cache can vary greatly.

    For example, the cache can serve as a plain intermediate buffer. The device processes data and transfers it to a high-speed buffer, from which the controller passes it on to the interface. Such a cache is meant to prevent errors, to let the hardware check data integrity, or to convert the device's signal into one the interface understands, all without delays. This scheme is used, for example, in CD/DVD drives.

    In another case, the cache can serve to store frequently used code and data, and thereby speed up processing: the device does not need to compute or look up the data again, which would take much longer than reading it from the cache. Here the size and speed of the cache play a very important role.
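    The "store frequently used results instead of recomputing them" principle fits in a dozen lines of Python. The names and the toy workload are invented for illustration; the point is that the real work runs once and every later request is served from the cache.

```python
memo = {}        # the cache: argument -> previously computed result
calls = 0        # counts how often the "slow" work actually runs

def slow_square(x):
    global calls
    if x in memo:            # cache hit: no recomputation needed
        return memo[x]
    calls += 1               # cache miss: do the real (slow) work once
    memo[x] = x * x
    return memo[x]

for _ in range(1000):
    slow_square(7)

print(slow_square(7), calls)   # 49 1  (computed once, then served from cache)
```

    The bigger and faster the memo store, the more of the workload can skip recomputation, which is precisely the trade-off the rest of this section discusses.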


    This architecture is most often found on hard drives, SSD drives and central processing units (CPUs).

    While a device is operating, its firmware or dispatcher programs can be loaded into the cache; running them straight from ROM (read-only memory) would be slower.

    Most modern devices use a mixed cache, which serves both as an intermediate buffer and as storage for frequently used code and data.

    There are several very important functions implemented for the cache of processors and video chips.

    Sharing results between execution units. Central and video processors often use a fast cache shared between cores. If one core has already processed some information and it sits in the cache, and a command arrives for the same operation or for work with this data, the data is not processed again but taken from the cache for further use, and the core is freed to handle other work. This significantly increases performance in repetitive but complex calculations, especially when the cache is large and fast.
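    The shared-cache idea can be sketched with threads standing in for cores and a dict standing in for the shared cache (all names here are invented for illustration): whichever "core" gets there first computes the result, and the rest reuse it.

```python
import threading

shared_cache = {}        # stands in for the cache shared between cores
lock = threading.Lock()
computations = 0         # how many times the real work was actually done

def compute(task):
    global computations
    with lock:
        if task in shared_cache:       # another "core" already did this work
            return shared_cache[task]
        computations += 1              # only the first worker computes
        shared_cache[task] = sum(range(task))
        return shared_cache[task]

threads = [threading.Thread(target=compute, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(computations)      # 1: computed once, reused by the other three workers
```

    In real silicon the sharing happens in hardware without locks, but the payoff is the same: duplicate work is replaced by a cache lookup.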

    The shared cache also allows cores to work with it directly, bypassing slow RAM.

    Cache for instructions. There is either a shared, very fast L1 cache for instructions and other operations, or a dedicated instruction cache. The more instructions a processor works with, the larger the instruction cache it requires. This reduces memory latency and lets the instruction unit run almost independently; when the instruction cache falls short, the instruction unit periodically sits idle, which slows down computation.

    Other functions and features.

    It is noteworthy that CPUs (central processing units) use hardware error correction (ECC) in the cache, because a single small error there can cascade into errors throughout all further processing of that data.

    In the CPU and GPU there is a cache hierarchy that separates data private to individual cores from shared data. In most designs almost all data in the second-level cache is also copied to the shared third level, though not always. The first cache level is the fastest, and each subsequent level is slower but larger.
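    The inclusive two-level arrangement described above can be modeled in a few lines: a tiny "L1" backed by a larger "L2" that also keeps a copy of everything in L1. The sizes, the FIFO eviction and the access sequence are arbitrary illustrative choices, not how real replacement policies work.

```python
from collections import OrderedDict

class TwoLevelCache:
    def __init__(self, l1_size=4, l2_size=16):
        self.l1 = OrderedDict()   # small and "fast"
        self.l2 = OrderedDict()   # bigger, "slower", inclusive of L1
        self.l1_size, self.l2_size = l1_size, l2_size
        self.l1_hits = self.l2_hits = self.misses = 0

    def _put(self, level, size, key, value):
        level[key] = value
        if len(level) > size:
            level.popitem(last=False)   # evict oldest (FIFO, simplified LRU)

    def get(self, key, load):
        if key in self.l1:
            self.l1_hits += 1
        elif key in self.l2:
            self.l2_hits += 1
            self._put(self.l1, self.l1_size, key, self.l2[key])  # promote
        else:
            self.misses += 1
            value = load(key)
            self._put(self.l2, self.l2_size, key, value)  # inclusive: fill L2
            self._put(self.l1, self.l1_size, key, value)  # ...and L1
        return self.l1.get(key, self.l2.get(key))

cache = TwoLevelCache()
for k in [1, 2, 3, 4, 5, 1]:
    cache.get(k, load=lambda k: k * 10)
print(cache.l1_hits, cache.l2_hits, cache.misses)  # 0 1 5
```

    Key 1 is evicted from the tiny L1 by the time it is requested again, but the inclusive L2 still holds it, so the request is served without going back to "memory".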

    For processors, three or fewer cache levels are considered normal. This allows for a balance between speed, cache size and heat dissipation. It is difficult to find more than two cache levels in video processors.

    Cache size, impact on performance and other characteristics.

    Naturally, the larger the cache, the more data it can store and process, but there is a serious problem here.

    A large cache means a large transistor budget. In server CPUs, the cache can consume up to 80% of the transistor budget. This, firstly, drives up the final cost and, secondly, increases power consumption and heat dissipation, which is out of all proportion to a performance gain of only a few percent.