Blue Forest http://www.lslnet.com at 10:18 on June 6, 2006
Wrox Red Book "Advanced Programming in C# (3rd Edition)"
Threading in C#
This article is excerpted from the Wrox Red Book "Advanced Programming in C# (3rd Edition)", published by Tsinghua University Press. Reproductions must credit the source.
This article presents the support that C# and the .NET base classes provide for developing multithreaded applications. We briefly introduce the Thread class and the various kinds of thread support, illustrate the principles with two threading examples, and then discuss the problems that arise with thread synchronization. Because the subject is very complex, the emphasis here is on understanding some basic principles rather than on developing real applications. The main topics are:
. How to start a thread
. Setting thread priorities
. Controlling access to objects through synchronization
After working through this article you will be able to use threads in your own code, and you will know the basics of threading.
15.1 Threads
A thread is a sequence of execution within a program. Any program written in C# has one entry point: the Main() method. Execution starts at the first statement of Main() and continues until that method returns.
This structure works well for programs that consist of one identifiable sequence of tasks, but a program often needs to do several things at once. For example, when you start Internet Explorer you may grow impatient as a page takes longer and longer to load. Eventually (perhaps after two seconds) you click the Back button or type the URL of another page. For this to work, Internet Explorer must be doing at least three things:
. Collecting the data for the page, and the files that accompany it, as they are returned from the Internet
. Rendering the page
. Watching for user input that asks IE to do something else (for example, a button click)
The same situation arises whenever a program carries out a task while displaying a dialog box that lets the user cancel that task at any time.
Let us look at the Internet Explorer example in more detail. To simplify the problem, we ignore the task of storing the data that arrives from the Internet and assume that Internet Explorer has only two tasks:
. Rendering the page
. Watching for user input
Assume that the page takes a long time to render: it may contain processor-intensive JavaScript, or a marquee element that needs continuous updating. One way to handle this is to write a method that does some of the rendering work and then, after a short while (say, one twentieth of a second), checks whether there has been any user input. If there has, the input is processed (which may mean cancelling the rendering task). Otherwise, the method carries on rendering for another twentieth of a second.
This approach works, but it is very complicated to implement. Worse, it completely ignores the event-based architecture of Windows: when user input arrives, the system notifies the application by raising an event. The approach can be modified to use Windows events as follows:
. Write an event handler that responds to user input. The response might include setting a flag that indicates rendering should stop.
. Write a method that handles the rendering. This method is designed to run whenever the application has nothing else to do.
This solution is better because it works with the Windows event architecture. But look at what the rendering method has to do: it must time itself carefully from the outset. While it runs, the computer cannot respond to any user input. The method therefore has to note the time at which it was called, keep monitoring the time as it works, and return once a suitable period has elapsed (to stay responsive to the user, a little under one tenth of a second at most). Moreover, before it returns it must store its state so that the next call knows where to carry on. Such a method can certainly be written, and in the days of Windows 3.1 this is exactly what had to be done. Later, Windows 95 and Windows NT 3.1 introduced multithreading, which offers a much more convenient solution.
15.2 Multithreaded Applications
The example above shows a situation in which an application needs to handle more than one task. The most obvious solution is to give the application several threads of execution. A thread represents a sequence of instructions that the computer executes. There is no reason why an application should have only one such sequence; in fact, it can have as many threads as it likes. Each new thread of execution must be told at which method it should begin executing. The first thread in an application always starts at the Main() method, because the first thread is started by the .NET runtime, and Main() is the method the runtime selects. Subsequent threads are started internally by the application, so the application chooses where each of them starts.
How multithreading works
So far we have spoken of threads executing at the same time. In fact, one processor can do only one thing at any given moment. On a system with more than one processor it is theoretically possible for several instructions to execute simultaneously, one per processor, but most of us work on single-processor machines, where this is impossible. What actually happens is that Windows appears to handle several tasks at once by a procedure known as pre-emptive multitasking.
With pre-emptive multitasking, Windows picks a thread in some process and lets it run for a short period. Microsoft has not documented how long this period is, because it is an internal parameter that Windows tunes for best performance; applications do not need to know it. From our point of view it is very short, certainly no more than a few milliseconds. This short period is called the thread's time slice. When the time slice is over, Windows takes back control and picks the next thread to be allocated a time slice. The time slices are so short that many things appear to happen simultaneously.
Even when an application has only one thread, pre-emptive multitasking is still going on, because many other processes are running on the system and each of them needs time slices for its threads. That is why, when many windows are open on the screen, each representing a different process, you can click any one of them and it appears to respond straight away. The response is not actually instantaneous: it happens the next time the thread that handles user input for that window gets a time slice. Even if the system is very busy, the wait is so short that the user does not notice it.
15.3 Working with Threads
Threads are manipulated through the Thread class, which lives in the System.Threading namespace. An instance of Thread represents one thread, that is, one sequence of execution. You create another thread simply by instantiating a Thread object.
Starting a thread
To make the following code more concrete, suppose you are writing a graphics image editor and the user asks to change the color depth of an image. For a large image this can take a while. You decide to create a separate thread to do the processing, so that the user interface is not tied up while the color depth is being changed. First, instantiate a Thread object:
// entryPoint has been declared previously as a delegate
// of type ThreadStart
Thread depthChangeThread = new Thread(entryPoint);
This code gives the variable the name depthChangeThread.
Note:
Additional threads that an application creates to perform some task are often known as worker threads.
As the code above shows, the Thread constructor takes one parameter, which specifies the entry point of the thread: the method at which the thread starts executing. Because we are passing the details of a method, a delegate is needed. The System.Threading namespace already defines a suitable delegate type, called ThreadStart, with the following signature:
public delegate void ThreadStart();
The parameter passed to the constructor must be a delegate of this type.
Constructing the Thread object does not start the new thread, however; the thread simply waits to be run. You start it by calling the Thread.Start() method.
Suppose there is a method ChangeColorDepth():
void ChangeColorDepth()
{
   // processing to change the color depth of the image
}
The following code can then be executed:
ThreadStart entryPoint = new ThreadStart(ChangeColorDepth);
Thread depthChangeThread = new Thread(entryPoint);
depthChangeThread.Name = "Depth Change Thread";
depthChangeThread.Start();
Once this code has run, two threads are running.
Figure 15-1
This code uses the Thread.Name property to give the thread a friendly name, as shown in Figure 15-1. Doing so is not necessary, but it can be useful.
Note that the thread entry point (ChangeColorDepth() in this case) cannot take any parameters, so some other means must be found of passing the method the information it needs. The most obvious way is to use member fields of the class to which the method belongs. The method also cannot return a value. (Where would a return value go? As soon as the method returns, the thread that ran it terminates, so there is nothing left to receive a return value, and we can hardly return it to the thread that started this one, since that thread is presumably busy doing something else.)
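The member-field technique just described can be sketched as follows. This is a minimal illustration, not the book's code: the DepthChanger class, its fields, and the bit-mask "processing" are hypothetical stand-ins for the real image editor.

```csharp
using System;
using System.Threading;

// Hypothetical worker class: its fields carry the input to the thread
// entry point and receive the result, because ThreadStart takes no
// parameters and returns no value.
public class DepthChanger
{
    private readonly int[] pixels;      // input, set before the thread starts
    private readonly int newDepthBits;  // input, set before the thread starts
    public int PixelsChanged;           // output, read after the thread ends

    public DepthChanger(int[] pixels, int newDepthBits)
    {
        this.pixels = pixels;
        this.newDepthBits = newDepthBits;
    }

    // Matches the ThreadStart signature: void, no parameters.
    public void ChangeColorDepth()
    {
        int mask = (1 << newDepthBits) - 1;
        for (int i = 0; i < pixels.Length; i++)
        {
            pixels[i] &= mask;   // stand-in for the real image processing
            PixelsChanged++;
        }
    }
}

public static class DepthChangeDemo
{
    public static int Run()
    {
        DepthChanger changer = new DepthChanger(new int[] { 255, 1023, 70000 }, 8);
        Thread depthChangeThread = new Thread(new ThreadStart(changer.ChangeColorDepth));
        depthChangeThread.Name = "Depth Change Thread";
        depthChangeThread.Start();
        depthChangeThread.Join();   // wait, so that the result field is complete
        return changer.PixelsChanged;
    }

    public static void Main()
    {
        Console.WriteLine("Pixels changed: " + Run());
    }
}
```

Because the entry point is an instance method, each thread can be given its own DepthChanger object and therefore its own inputs and result, without any shared static state.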
Once a thread has been started, it can be suspended, resumed, or aborted. Suspending a thread puts it to sleep: the thread simply stops running for a time and takes up no processor time, and it can later be resumed, carrying on from the point at which it was suspended. Aborting a thread stops it running for good: Windows permanently destroys all the data it maintains for the thread, so the thread cannot be restarted.
Continuing the image editor example, suppose that for some reason the user interface thread displays a dialog box that gives the user the chance to suspend the conversion temporarily (users would not normally want to do this, but it is only an example; a more realistic case would be pausing the playback of a sound or video file). The main thread responds with the following code:
depthChangeThread.Suspend();
If the user later asks for processing to resume, use the following method:
depthChangeThread.Resume();
Finally, if the user decides (a realistic scenario this time) against the conversion after all and clicks the Cancel button, use the following method:
depthChangeThread.Abort();
Note that the Suspend() and Abort() methods do not necessarily take effect immediately. In the case of Suspend(), .NET may let the thread being suspended execute a few more instructions in order to reach a point at which .NET regards the thread as safely suspendable. Technically, this is to ensure the correct operation of the garbage collector; the MSDN documentation has the details. When a thread is aborted, the Abort() method works by raising a ThreadAbortException in the affected thread. ThreadAbortException is a special exception class that ordinary code does not handle. If the thread being aborted is executing code inside try blocks, the corresponding finally blocks are executed before the thread is actually killed. This guarantees that resources can be cleaned up and gives the thread a chance to leave the data it was manipulating (for example, fields of a class instance that will remain after the thread is gone) in a valid state.
Note:
Before .NET, aborting a thread in this way was not recommended except in extreme cases, because the affected thread was simply terminated on the spot, which could leave the data it was processing in an invalid state and the resources it was using still allocated. The exception mechanism that .NET uses in this situation makes aborting threads safer.
Although this exception mechanism makes aborting a thread relatively safe, it does mean that aborting can take some time, because in theory there is no limit on how long the code in a finally block may take to execute. After aborting a thread, you may therefore have to wait until the thread has really been killed before continuing. If subsequent processing depends on the other thread having terminated, call the Join() method to wait for it:
depthChangeThread.Abort();
depthChangeThread.Join();
Join() also has overloads that let you specify a limit on how long to wait. If the limit elapses, execution simply continues. If no limit is specified, the calling thread waits for as long as it takes.
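The two forms of Join() can be sketched as below. The SlowWork() method and the timings are assumptions made for illustration. The sketch deliberately lets the worker finish on its own rather than calling Abort(), because Thread.Abort() is not supported on current .NET runtimes, where it throws PlatformNotSupportedException.

```csharp
using System;
using System.Threading;

public static class JoinTimeoutDemo
{
    // Hypothetical worker body: simulates work that takes about half a second.
    private static void SlowWork()
    {
        Thread.Sleep(500);
    }

    // Returns { result of the timed Join, IsAlive after the untimed Join }.
    public static bool[] Run()
    {
        Thread worker = new Thread(new ThreadStart(SlowWork));
        worker.Start();

        // Wait at most 20 ms. Join(int) returns false if the thread
        // has not terminated when the limit elapses.
        bool finishedInTime = worker.Join(20);

        // Wait without a limit: returns only once the thread has terminated.
        worker.Join();
        bool stillAlive = worker.IsAlive;

        return new bool[] { finishedInTime, stillAlive };
    }

    public static void Main()
    {
        bool[] result = Run();
        Console.WriteLine("Finished within 20 ms: " + result[0]);
        Console.WriteLine("Alive after Join(): " + result[1]);
    }
}
```

The Boolean returned by the timed overload is what lets the caller decide whether to carry on, wait again, or report that the operation is taking too long.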
The code above also shows one thread performing operations on another thread (or at least, in the case of Join(), waiting for another thread). But what if the main thread wants to perform operations on itself? For that it needs a reference to the Thread object that represents its own thread. It can obtain one from the static CurrentThread property of the Thread class:
Thread myOwnThread = Thread.CurrentThread;
Thread is a slightly unusual class to work with, because even before you instantiate any other thread there is always one thread present: the thread that is currently executing. As a result, the class is used in two different ways:
. You can instantiate a Thread object, which then represents a running thread, and call instance members on it. These members apply to the thread that the object represents.
. You can call any of its static methods. These generally apply to the thread from which the call is made.
One static method that you can call is Sleep(), which puts the calling thread to sleep for a specified period, after which it continues running.
15.4 The ThreadPlayaround Sample
The ThreadPlayaround sample is a simple example that illustrates the use of threads. Its purpose is to show how threads are manipulated, not to solve a realistic programming problem.
The core of the ThreadPlayaround sample is the DisplayNumbers() method, which counts upward and displays the count every so often. When it starts, DisplayNumbers() also displays the name and culture of the thread on which it is running:
static void DisplayNumbers()
{
   Thread thisThread = Thread.CurrentThread;
   string name = thisThread.Name;
   Console.WriteLine("Starting thread: " + name);
   Console.WriteLine(name + ": Current Culture = " +
                     thisThread.CurrentCulture);
   for (int i = 1; i <= 8 * interval; i++)
   {
      if (i % interval == 0)
         Console.WriteLine(name + ": count has reached " + i);
   }
}
How far the count goes depends on the interval field, whose value is entered by the user. If the user enters 100, the method counts to 800 and displays 100, 200, 300, 400, 500, 600, 700, and 800; if the user enters 1000, it counts to 8000 and displays 1000, 2000, 3000, 4000, 5000, 6000, 7000, and 8000; and so on. This may seem pointless, but its purpose is to keep the processor busy for a while so that we can see how the processor handles the task.
The ThreadPlayaround sample starts a second, worker thread that runs DisplayNumbers(). Immediately after starting the worker, the main thread begins executing the same method, so we should see two counts proceeding at the same time.
The Main() method of the ThreadPlayaround sample and its containing class are as follows:
class EntryPoint
{
   static int interval;

   static void Main()
   {
      Console.Write("Interval to display results at?> ");
      interval = int.Parse(Console.ReadLine());
      Thread thisThread = Thread.CurrentThread;
      thisThread.Name = "Main Thread";
      ThreadStart workerStart = new ThreadStart(StartMethod);
      Thread workerThread = new Thread(workerStart);
      workerThread.Name = "Worker";
      workerThread.Start();
      DisplayNumbers();
      Console.WriteLine("Main Thread Finished");
      Console.ReadLine();
   }
}
The code begins with the class declaration; interval is a static field of this class. The Main() method first asks the user for the interval. It then obtains a reference to the Thread object that represents the main thread, so that the thread can be given a name and its progress can be identified in the output.
Next the worker thread is created, given a name, and started, having been passed a delegate that specifies the method at which it must start, StartMethod(). Finally, Main() calls DisplayNumbers() to begin its own count. The entry point of the worker thread is:
static void StartMethod()
{
   DisplayNumbers();
   Console.WriteLine("Worker Thread Finished");
}
Note that all of these methods are static methods of the EntryPoint class. The two counts are entirely independent, because the variable i that DisplayNumbers() uses for counting is a local variable. Local variables are scoped to the method in which they are defined and are visible only to the thread that is executing that method. If another thread starts executing the same method, it gets its own copy of the local variables. Running the code with a relatively small interval of 100 gives the following results:
ThreadPlayaround
Interval to display results at?> 100
Starting thread: Main Thread
Main Thread: Current Culture = en-US
Main Thread: count has reached 100
Main Thread: count has reached 200
Main Thread: count has reached 300
Main Thread: count has reached 400
Main Thread: count has reached 500
Main Thread: count has reached 600
Main Thread: count has reached 700
Main Thread: count has reached 800
Main Thread Finished
Starting thread: Worker
Worker: Current Culture = en-US
Worker: count has reached 100
Worker: count has reached 200
Worker: count has reached 300
Worker: count has reached 400
Worker: count has reached 500
Worker: count has reached 600
Worker: count has reached 700
Worker: count has reached 800
Worker Thread Finished
As a demonstration of threads working in parallel, this is not very convincing: the two threads have not run side by side at all. The main thread starts, counts up to 800, and finishes; only then does the worker thread start and do its counting.
The problem is that starting a thread is a major operation. After instantiating the new thread, the main thread reaches this line of code:
workerThread.Start();
The call to Thread.Start() tells Windows that the new thread is ready to be started, and then returns immediately. While the main thread is counting up to 800, Windows is busy making the arrangements for the new thread to start, which means allocating resources for the thread and performing various security checks. By the time the new thread actually starts, the main thread has already finished its work.
We can get around this by choosing a larger interval, so that both threads spend longer in the DisplayNumbers() method. Entering 1000000 as the interval gives the following results:
ThreadPlayaround
Interval to display results at?> 1000000
Starting thread: Main Thread
Main Thread: Current Culture = en-US
Main Thread: count has reached 1000000
Starting thread: Worker
Worker: Current Culture = en-US
Main Thread: count has reached 2000000
Worker: count has reached 1000000
Main Thread: count has reached 3000000
Worker: count has reached 2000000
Main Thread: count has reached 4000000
Worker: count has reached 3000000
Main Thread: count has reached 5000000
Main Thread: count has reached 6000000
Worker: count has reached 4000000
Main Thread: count has reached 7000000
Worker: count has reached 5000000
Main Thread: count has reached 8000000
Main Thread Finished
Worker: count has reached 6000000
Worker: count has reached 7000000
Worker: count has reached 8000000
Worker Thread Finished
Now we can see the two threads really working in parallel. The main thread starts and counts up to one million. At some point while the main thread is counting its second million, the worker thread starts; from then on the two threads progress at about the same rate until both have finished.
It is important to understand that unless you are running a multiprocessor computer, using two threads for a CPU-intensive task does not save any time. On a single-processor machine, having two threads each count to 8 million takes just as long as having one thread count to 16 million. It may even take slightly longer, because with the extra thread the operating system has to spend a little time switching between threads, though the difference is negligible. Using multiple threads has two advantages. First, an application can stay responsive, because one thread handles user input while another does work in the background. Second, time is saved if one or more threads are doing something that does not need CPU time (for example, waiting for data to arrive from the Internet), because the other threads can carry on with their work while the inactive thread waits.
15.5 Thread Priorities
What if some threads in an application are more important than others? In that case, different threads within a process can be assigned different priorities. In general, a thread is not allocated any time slices while a higher-priority thread is working. The advantage of this is that you can guarantee responsiveness by assigning a high priority to the thread that receives user input. Most of the time that thread has nothing to do, and the other threads get on with their work. As soon as the user enters something, however, the user-input thread takes precedence over the other threads in the application for the short time it needs to handle the input event.
A high-priority thread can completely prevent lower-priority threads from running, so thread priorities should be changed with care. Thread priorities are defined as values of the ThreadPriority enumeration: Highest, AboveNormal, Normal, BelowNormal, and Lowest.
Note that each process has a base priority, and these values are relative to the priority of the process. Giving a thread a high priority ensures that it gets preference over the other threads in the same process, but other processes running on the system may have threads with still higher priority. Windows itself tends to give very high priority to its own operating system threads.
We can see the effect of changing a thread's priority by making the following change to the Main() method in the ThreadPlayaround sample:
ThreadStart workerStart = new ThreadStart(StartMethod);
Thread workerThread = new Thread(workerStart);
workerThread.Name = "Worker";
workerThread.Priority = ThreadPriority.AboveNormal;
workerThread.Start();
Here the worker thread is given a higher priority than the main thread. The results are as follows:
ThreadPlayaroundWithPriorities
Interval to display results at?> 1000000
Starting thread: Main Thread
Main Thread: Current Culture = en-US
Starting thread: Worker
Worker: Current Culture = en-US
Main Thread: count has reached 1000000
Worker: count has reached 1000000
Worker: count has reached 2000000
Worker: count has reached 3000000
Worker: count has reached 4000000
Worker: count has reached 5000000
Worker: count has reached 6000000
Worker: count has reached 7000000
Worker: count has reached 8000000
Worker Thread Finished
Main Thread: count has reached 2000000
Main Thread: count has reached 3000000
Main Thread: count has reached 4000000
Main Thread: count has reached 5000000
Main Thread: count has reached 6000000
Main Thread: count has reached 7000000
Main Thread: count has reached 8000000
Main Thread Finished
This shows that when the worker thread has the priority AboveNormal, the main thread scarcely gets to run once the worker thread has started.
15.6 Synchronization
An important aspect of working with threads is synchronizing access to any variable that more than one thread can reach. Synchronization means that only one thread at a time may access the variable. If access to such variables is not synchronized, subtle errors can result. This section briefly outlines some of the main issues involved.
15.6.1 What Is Synchronization?
The need for synchronization arises because what looks like a single statement in C# source code will, in most cases, be translated into many statements in the final compiled assembly language or machine code. Take the following line:
message += ", there"; // message is a string that contains "Hello"
This is one statement in C# syntax, but when the code executes it involves a number of separate operations. Memory has to be allocated to hold the new, longer string; the variable message has to be made to refer to the new memory; the actual text has to be copied; and so on.
A string was obviously chosen here as a fairly complex case, but even arithmetic on primitive numeric types often involves more behind the scenes than the C# code suggests. In particular, many operations cannot be carried out directly on variables stored in memory: their values have to be copied separately into special locations in the processor known as registers.
Whenever a single C# statement translates into several machine-code instructions, the thread's time slice may end part-way through executing the statement. If it does, another thread in the same process may be given a time slice and, if access to the variables involved in the statement (message, in the example above) is not synchronized, that thread may read or write the same variable. In the example above, would the other thread see the old value of message or the new one?
The problem can be worse than this. The statement in the example is relatively simple, but with a more complicated statement a variable might be in an undefined state for a brief period while the statement executes. A thread that reads the variable at that moment simply reads garbage. More serious still, if two threads write data to the same variable at the same time, the variable may end up containing an incorrect value.
Synchronization was not an issue in the ThreadPlayaround sample, because the two threads used mainly local variables. The only variable that both threads could access was the interval field, but the main thread initialized this field before the other thread started, and after that both threads only read its value, so there was no problem. Synchronization matters only when at least one thread writes to a variable while another thread reads or writes the same variable.
C# provides a very simple way of synchronizing access to variables: the lock keyword, which is used as follows:
lock (x)
{
   DoSomething();
}
The lock statement wraps the variable in parentheses in what is known as a mutual exclusion lock, or mutex. The mutex is held for as long as the compound statement attached to the lock keyword is executing. While one thread holds the lock on a variable, no other thread can acquire a lock on the same variable. Suppose the thread holding the mutex in the code above loses its time slice while executing the compound statement. If the thread that gets the next time slice tries to lock the variable x, it is refused, and Windows simply puts that thread to sleep until the mutex is released.
The mutex is the simplest of a number of mechanisms for controlling access to variables. The others are not covered in depth here, but they are all controlled through the .NET base class System.Threading.Monitor. In fact, the C# lock statement is simply a syntax wrapper around a couple of method calls to this class.
In general, you should synchronize a variable whenever one thread may write to it while any other thread may read or write it. Thread synchronization is too big a topic to cover in detail here, but two potential problems with it are discussed below.
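To make the relationship between lock and Monitor concrete, here is a small sketch in which two threads increment a shared counter, one through a lock statement and one through explicit Monitor calls. The class, the sync object, and the iteration count are assumptions for illustration, and the Monitor form shown is only roughly what the compiler generates for lock (current compilers use a slightly different Monitor.Enter overload).

```csharp
using System;
using System.Threading;

public static class LockDemo
{
    private static readonly object sync = new object();  // hypothetical lock object
    private static int counter;                          // shared by both threads

    private static void AddWithLock()
    {
        for (int i = 0; i < 100000; i++)
        {
            lock (sync)
            {
                counter++;   // read-modify-write, protected by the lock
            }
        }
    }

    // Roughly equivalent to the lock statement above, written out by hand.
    private static void AddWithMonitor()
    {
        for (int i = 0; i < 100000; i++)
        {
            Monitor.Enter(sync);
            try
            {
                counter++;
            }
            finally
            {
                Monitor.Exit(sync);   // released even if the body throws
            }
        }
    }

    public static int Run()
    {
        counter = 0;
        Thread t1 = new Thread(new ThreadStart(AddWithLock));
        Thread t2 = new Thread(new ThreadStart(AddWithMonitor));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine("Final count: " + Run());
    }
}
```

Because both methods lock the same object, the two forms exclude each other and every increment is counted. Removing the synchronization from either method would allow increments to be lost.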
15.6.2 Synchronization Issues
Thread synchronization is very important in multithreaded applications. It is, however, an area that needs careful thought, because it is prone to subtle bugs that are hard to detect, notably deadlocks and race conditions.
(1) Do not overuse synchronization
Thread synchronization is important, but it is important to use it only where it is needed, because it can hurt performance. There are two reasons for this. First, placing and removing a lock on an object carries some system overhead, although this overhead is very small. The second reason is more important: the more thread synchronization is used, the more threads are kept waiting for objects to be released. While one thread holds a lock on an object, any other thread that needs the object simply halts until the lock is released. It is therefore important to place as little code as possible inside lock blocks while still avoiding thread synchronization bugs. In a sense, a lock statement temporarily switches off the multithreading ability of the application and with it, temporarily, the benefits of multithreading.
On the other hand, the dangers of using too much synchronization (reduced performance and responsiveness) are nowhere near as great as the dangers of using too little where it is needed (run-time bugs that are very hard to track down).
(2) Deadlocks
A deadlock is a bug that can occur when two threads each need access to resources that are locked by the other. Suppose one thread runs the following code, where a and b are two object references that both threads can access:
lock (a)
{
   // do something
   lock (b)
   {
      // do something
   }
}
At the same time, another thread runs the following code:
lock (b)
{
   // do something
   lock (a)
   {
      // do something
   }
}
Depending on the exact moments at which the threads reach the various statements, the following can happen. The first thread acquires the lock on a while, at about the same time, the second thread acquires the lock on b. Shortly afterward, the first thread reaches the lock(b) statement and immediately goes to sleep, waiting for the lock on b to be released. Soon after that, the second thread reaches its lock(a) statement and also goes to sleep, ready for Windows to wake it when the lock on a is released. But the lock on a will never be released, because the first thread, which holds it, is asleep and will not wake until the lock on b is released, and that cannot happen until the second thread wakes. The result is a deadlock: both threads do nothing but wait for each other to release a lock. An application in this state simply hangs, and can only be stopped by ending the whole process from Task Manager.
Note:
In this situation, another thread cannot release the lock: a mutual exclusion lock can be released only by the thread that acquired it.
Deadlocks can be avoided by making both threads claim locks on the objects in the same order. In the example above, if the second thread claimed the locks in the same order as the first thread, a first and then b, whichever thread locked a first would complete its work before the other thread could start. No deadlock could then occur.
You might think deadlocks are easy to avoid when coding. In the code above the possibility of a deadlock is fairly obvious, so you would not write such code in the first place. Remember, however, that different locks can be claimed in different method calls. Suppose the first thread in this example actually executes the following code:
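The lock-ordering rule can be sketched as follows. In this hypothetical example both threads run the same method, so both always take the lock on a before the lock on b; the class, field names, and iteration count are assumptions for illustration.

```csharp
using System;
using System.Threading;

public static class LockOrderDemo
{
    private static readonly object a = new object();
    private static readonly object b = new object();
    private static int completed;   // protected by holding both locks

    // Both threads run this method, so both take a first and then b.
    // Neither thread can ever hold b while waiting for a.
    private static void Work()
    {
        for (int i = 0; i < 1000; i++)
        {
            lock (a)
            {
                lock (b)
                {
                    completed++;
                }
            }
        }
    }

    public static int Run()
    {
        completed = 0;
        Thread first = new Thread(new ThreadStart(Work));
        Thread second = new Thread(new ThreadStart(Work));
        first.Start();
        second.Start();
        first.Join();    // returns because no deadlock is possible
        second.Join();
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine("Iterations completed: " + Run());
    }
}
```

If one of the threads took the locks in the opposite order, b first and then a, the same program could hang in one of the Join() calls, exactly as described above.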
lock (a)
{
   // do bits of processing
   CallSomeMethod();
}
CallSomeMethod() may call other methods, one of which contains a lock(b) statement. When writing code like this, it is far from obvious whether a deadlock can occur.
(3) Race conditions
A race condition is somewhat subtler than a deadlock. It rarely halts execution of the process, but it can lead to data corruption. A race condition is hard to define precisely, but it generally occurs when several threads try to access the same data without adequately taking into account what the other threads are doing. Race conditions are best understood through an example.
Suppose there is an array of objects, each element of which needs to be processed, and several threads are available to share the processing. Suppose further that an object called ArrayController contains the array together with an int that records how many objects have been processed and therefore which object is to be processed next. ArrayController implements the following method:
int GetObject(int index)
{
// returns the object at the given index.
}
and a read/write property:
int ObjectsProcessed
{
// indicates how many of the objects have been processed.
}
Each thread that helps process the objects executes the following code:
object next;
lock(ArrayController)
{
int nextIndex = ArrayController.ObjectsProcessed;
Console.WriteLine("object to be processed next is " + nextIndex);
++ArrayController.ObjectsProcessed;
next = ArrayController.GetObject(nextIndex);
}
ProcessObject(next);
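This code works, but suppose that, to avoid tying up the resource for too long, we decide not to hold the lock on ArrayController while displaying the message to the user. The code is therefore rewritten as:
int nextIndex;
object next;
lock(ArrayController)
{
nextIndex = ArrayController.ObjectsProcessed;
}
Console.WriteLine("object to be processed next is " + nextIndex);
lock(ArrayController)
{
++ArrayController.ObjectsProcessed;
next = ArrayController.GetObject(nextIndex);
}
ProcessObject(next);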
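There is now a potential problem. What happens if one thread obtains, say, the 11th object in the array and goes on to display the message saying that it is about to process it? Meanwhile, a second thread starts executing the same code, reads ObjectsProcessed, and determines that the next object to be processed is also the 11th, because the first thread has not yet updated ArrayController.ObjectsProcessed. While the second thread is telling the console that it will process the 11th object, the first thread acquires the second lock on ArrayController and increments ObjectsProcessed inside it. But it is too late: both threads are now processing the same object. This situation is a race condition.
For both deadlocks and race conditions, the conditions under which the error occurs are often not obvious, and when it does occur it is hard to identify. Spotting them generally takes experience. Nevertheless, when you write multithreaded applications that need synchronization, you must consider every part of the code and check whether a deadlock or race condition could arise. Remember that you cannot predict the exact moment at which different threads will reach different statements.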
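One way to keep the lock short without opening the race window is to read and increment the counter inside a single lock and do the slow work outside it. The sketch below assumes a simplified stand-in for the ArrayController described above; the names ArrayControllerSketch, ClaimNextIndex(), and RunDemo() are hypothetical and are not the book's code.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the ArrayController described in the text.
public class ArrayControllerSketch
{
    private readonly int[] timesProcessed;   // how often each element was processed
    private int objectsProcessed = 0;
    private readonly object sync = new object();

    public ArrayControllerSketch(int count)
    {
        timesProcessed = new int[count];
    }

    // Reads and increments the counter under ONE lock; returns -1 when no work is left.
    public int ClaimNextIndex()
    {
        lock (sync)
        {
            if (objectsProcessed >= timesProcessed.Length) return -1;
            return objectsProcessed++;
        }
    }

    public void Work()
    {
        int index;
        while ((index = ClaimNextIndex()) >= 0)
        {
            // The slow part (messages, processing) runs outside the lock.
            Interlocked.Increment(ref timesProcessed[index]);
        }
    }

    // Returns true if every element was processed exactly once.
    public static bool RunDemo(int count, int threadCount)
    {
        ArrayControllerSketch controller = new ArrayControllerSketch(count);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(new ThreadStart(controller.Work));
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        foreach (int n in controller.timesProcessed)
        {
            if (n != 1) return false;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(RunDemo(1000, 4));   // True
    }
}
```

Because each index is handed out exactly once, no two threads can be told to process the same element, however the threads are scheduled.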
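15.7 Summary
This article has shown how to write multithreaded applications using the System.Threading namespace. Using multiple threads in an application requires careful planning. Too many threads cause resource problems, while too few can leave the application sluggish and performing poorly.
The System.Threading namespace in the .NET Framework lets you work with threads, but the .NET Framework does not do all the hard work of multithreading for you. You still have to think about thread priority and synchronization. This article has discussed these issues and shown how to code for them in C# applications. It has also covered the problems associated with deadlocks and race conditions.
If you want to use multithreading in your C# applications, careful planning is essential.
Excerpted from the Wrox Red Book "Advanced Programming in C# (3rd Edition)" published by Tsinghua University Press; reproductions must credit the source |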
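Memory Management and Pointers in C# (1)
Excerpted from the Wrox Red Book "Advanced Programming in C# (3rd Edition)" published by Tsinghua University Press; reproductions must credit the source
This article covers various aspects of memory management and memory access. Although the runtime takes care of most memory management for the programmer, you still need to understand how memory management works and how to deal with unmanaged resources.
A good understanding of memory management and of the pointer facilities C# provides will also help you integrate C# code with legacy code and handle memory efficiently in performance-critical systems.
The main topics are:
● How the runtime allocates space on the stack and the heap
● How garbage collection works
● How to use destructors and the System.IDisposable interface to ensure that unmanaged resources are released correctly
● The syntax for using pointers in C#
● How to use pointers to implement high-performance stack-based arrays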
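7.1 Memory Management Under the Hood
One of the advantages of C# programming is that you do not need to worry about detailed memory management; in particular, the garbage collector takes care of all memory cleanup. You get efficiency approaching that of a language like C++ without the complexity of managing memory yourself as you would in C++. Even so, if you want to write efficient code you still need to understand what happens behind the scenes. This section looks at what happens in the computer's memory when you allocate variables.
Note:
Many of the details in this section are not officially documented. Treat it as a simplified guide to the general principles rather than an exact statement of the implementation.
7.1.1 Value Data Types
Windows uses a system known as virtual addressing, in which the memory addresses visible to a program are mapped onto actual locations in hardware memory. This mapping is managed entirely by Windows behind the scenes. The practical result is that each process on a 32-bit processor sees 4GB of available memory, regardless of how much physical memory the computer actually has. (On 64-bit processors this number is larger.) This 4GB of memory contains everything that is part of the program, including the executable code, all the DLLs loaded by the code, and the contents of all variables used while the program runs. This 4GB of memory is known as the virtual address space or virtual memory; for convenience we will simply call it memory.
Each memory location in the 4GB is numbered starting from zero. To access a value stored at a particular location, you need to supply the number that represents that location. In any high-level language, such as C#, VB, C++, or Java, the compiler is responsible for converting human-readable names into the memory addresses the processor understands.
Somewhere inside a process's virtual memory is an area known as the stack. The stack stores value data types that are not members of objects. In addition, when you call a method, the stack is used to hold a copy of any parameters passed to the method. To understand how the stack works, you need to keep in mind how variable scope works in C#. If a variable a goes into scope before a variable b, then b goes out of scope first. Consider the following code: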
{
int a;
// do something
{
int b;
// do something else
}
}
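First a is declared. Then, inside the inner code block, b is declared. The inner block then ends and b goes out of scope, and finally a goes out of scope. So the lifetime of b is entirely contained within the lifetime of a. Variables are always deallocated in the reverse order to that in which they were allocated, and this is how the stack works.
We do not know exactly where in the address space the stack is, and you do not need to know for C# development. A stack pointer (a variable maintained by the operating system) holds the address of the next free location on the stack. When a program first starts running, the stack pointer points to the end of the block of memory reserved for the stack. The stack actually fills downward, from high memory addresses to low ones. As data is put on the stack, the stack pointer is adjusted so that it always points to the next free location. Figure 7-1 illustrates this: it shows a stack pointer with the value 800000 (0xC3500 in hexadecimal), and the next free location is address 799999.
Figure 7-1
The following code tells the compiler that we need storage for an integer and a double, which will be referred to as nRacingCars and engineSize respectively. The line that declares each variable marks the point at which we start requiring access to it, and the closing curly brace marks the point at which both variables go out of scope.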
{
int nRacingCars = 10;
double engineSize = 3000.0;
// do calculations;
}
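Assume we are using the stack shown in Figure 7-1. The variable nRacingCars comes into scope with the value 10, which is placed in locations 799996 to 799999, the 4 bytes just below the location pointed to by the stack pointer. Four bytes are used because that is how much memory an int needs. To accommodate it, 4 is subtracted from the stack pointer, so that it now points to location 799996, just after the next free location (799995). engineSize, a double, then takes the next 8 bytes (799988 to 799995), and the stack pointer is decreased by a further 8.
When engineSize goes out of scope, the computer knows that the variable is no longer needed. Because variable lifetimes are always nested, we can guarantee that, whatever else happened while engineSize was in scope, the stack pointer is now pointing to the location where engineSize is stored. To remove the variable from memory, the stack pointer is incremented by 8, so that it points just past the space engineSize used. This happens at the closing curly brace. When nRacingCars also goes out of scope, the stack pointer is incremented by 4 again. If another variable is then placed in memory, the locations from 799999 downward, which previously stored nRacingCars, will be overwritten.
If the compiler meets a line such as int i, j, the order in which the two variables come into scope is indeterminate: both are declared at the same time and both go out of scope at the same time. In this situation it does not matter in what order they are removed from memory. Internally the compiler always ensures that the one put in memory first is removed last, so the rule is never at odds with variable lifetimes.
7.1.2 Reference Data Types
The stack gives very high performance, but it is not flexible enough to be used for all variables. The requirement that variable lifetimes be nested is too restrictive for many purposes. Often you want a method to allocate memory to store some data and to keep that data available long after the method has exited. This possibility exists whenever storage is requested with the new operator, as is the case for all reference types. That is where the managed heap comes in.
If you have written C++ code that required low-level memory management, you will be familiar with the heap. The managed heap is not quite the same as the heap C++ uses: it works under the control of the garbage collector and offers significant performance benefits compared with traditional heaps.
The managed heap (or heap for short) is just another area of memory within the 4GB available to the process. To see how the heap works and how memory is allocated for reference data types, look at the following code: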
void DoWork()
{
Customer arabel;
arabel = new Customer();
Customer mrJones = new Nevermore60Customer();
}
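This code assumes the existence of two classes, Customer and Nevermore60Customer. These classes are in fact taken from the Mortimer Phones example in Appendix A (available at www.wrox.com).
First, a Customer reference called arabel is declared. Space for it is allocated on the stack, but remember that this is only a reference, not an actual Customer object. The arabel reference takes up 4 bytes, enough to hold the address at which a Customer object will be stored (4 bytes are needed to represent an address between 0 and 4GB as an integer value).
Then comes the next line: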
arabel = new Customer();
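This line does several things. First, it allocates memory on the heap to store a Customer instance (a real instance, not just an address). Then it sets the variable arabel to the address of the memory allocated to the new Customer object. (It also calls the appropriate Customer() constructor to initialize the fields in the class instance, but we need not worry about that here.)
The Customer instance is not placed on the stack; it is placed on the heap. We do not know exactly how many bytes a Customer object occupies, but for the sake of argument assume it is 32. These 32 bytes contain the instance fields of Customer as well as some information that .NET uses to identify and manage its class instances.
To find a place on the heap for the new Customer object, the .NET runtime looks through the heap and takes the first contiguous, unused block of 32 bytes. For the sake of argument, assume this is at address 200000 and that the arabel reference occupies locations 799996 to 799999 on the stack. Before the arabel object is instantiated, memory looks like Figure 7-2.
Figure 7-2
After space has been allocated for the Customer object, memory looks like Figure 7-3. Note that, unlike the stack, memory on the heap is allocated upward, so the free space lies above the used space.
Figure 7-3
The next line of code both declares a Customer reference and instantiates an object. In this case, space for the mrJones reference is allocated on the stack and, at the same time, space for the object is allocated on the heap: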
Customer mrJones = new Nevermore60Customer();
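This line allocates 4 bytes on the stack for the mrJones reference, stored at locations 799992 to 799995, while the mrJones object itself is allocated space on the heap from location 200032 upward.
As this example shows, setting up a reference variable is more complex than setting up a value variable, and a performance overhead is unavoidable. In fact the process has been somewhat oversimplified here, because the .NET runtime also has to maintain information about the state of the heap, and this information must be updated whenever new data is added. Despite these overheads, we now have a mechanism for allocating variables that is not constrained by the limitations of the stack. By assigning the value of one reference variable to another of the same type, you get two variables that refer to the same object in memory. When a reference variable goes out of scope it is removed from the stack as described in the previous section, but the data for the referenced object remains on the heap until either the program terminates or the garbage collector removes it, which happens only when the data is no longer referenced by any variable.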
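The point about two reference variables sharing one heap object can be seen in a few lines. This is a small sketch; the class CustomerSketch is a hypothetical minimal class used only for illustration, not the Customer class from the Mortimer Phones example.

```csharp
using System;

// Hypothetical minimal class, not the book's Customer.
public class CustomerSketch
{
    public string Name;
}

public class ReferenceDemo
{
    public static string RenameThroughSecondReference()
    {
        CustomerSketch first = new CustomerSketch();   // object lives on the heap
        first.Name = "Arabel";
        CustomerSketch second = first;                 // copies the reference, not the object
        second.Name = "Mr Jones";
        return first.Name;                             // both variables see the same object
    }

    public static void Main()
    {
        Console.WriteLine(RenameThroughSecondReference());   // Mr Jones
    }
}
```

Only the 4-byte (or, on 64-bit systems, 8-byte) reference is copied by the assignment; the object's data stays where it is on the heap.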
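7.1.3 Garbage Collection
From the discussion and figures above you can see that the managed heap works very much like the stack, to the extent that successive objects are placed next to one another in memory. This makes it easy to determine where the next object goes, using a heap pointer that indicates the next free memory location and is adjusted as more objects are added to the heap. Things are more complicated, however, because the lifetimes of heap-based objects are not coupled to the scope of the stack-based variables that reference them.
When the garbage collector runs, it removes from the heap all objects that are no longer referenced. Immediately after it has done so, the heap has objects scattered across it, mixed up with memory that has just been freed, as shown in Figure 7-4.
Figure 7-4
If the managed heap stayed like this, allocating space for new objects would be an awkward process, with the runtime having to search the whole heap for a block of memory big enough for each new object. But the garbage collector does not leave the heap in this state. As soon as it has freed all the objects it can, it compacts the remaining ones by moving them back to the end of the heap to form one contiguous block again. This means the heap can continue to work just like the stack as far as locating where to store new objects is concerned. Of course, when objects are moved, all references to them need to be updated with the correct new addresses, but the garbage collector handles that too.
This compaction by the garbage collector is where the managed heap differs from old unmanaged heaps. With the managed heap, it is just a question of reading the value of the heap pointer, rather than searching a linked list of addresses for somewhere to put the new data. For this reason, instantiating an object under .NET is much faster. Interestingly, accessing objects tends to be faster too, because they are compacted toward the same area of memory on the heap, so less page swapping is needed. Microsoft believes that these gains compensate for the performance penalty incurred whenever the garbage collector has to do some work and fix up all the references to the objects it has moved.
Note:
Generally, the garbage collector runs when the .NET runtime determines that a collection is needed. You can force it to run at a particular point in your code by calling System.GC.Collect(). System.GC is a .NET base class that represents the garbage collector, and the Collect() method invokes it. There are very few situations in which this is appropriate; one example is when a large number of objects in your code have just stopped being referenced. Even then, the logic of the garbage collector does not guarantee that all unreferenced objects will be removed from the heap in a single pass. |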
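As a rough illustration of forcing a collection, the sketch below calls GC.Collect() and checks that the runtime's collection counter has gone up. It assumes a runtime that provides GC.CollectionCount(), which was added after the .NET 1.1 release the book targets; the class and method names are hypothetical. It shows only that a collection ran, not which objects were reclaimed.

```csharp
using System;

public class GcCollectSketch
{
    // Returns true if at least one generation 0 collection happened during the call.
    public static bool CollectRuns()
    {
        int before = GC.CollectionCount(0);

        // Create some short-lived objects that are unreferenced straight away.
        for (int i = 0; i < 10000; i++)
        {
            byte[] temp = new byte[1024];
            temp[0] = 1;
        }

        GC.Collect();   // ask the garbage collector to run now
        int after = GC.CollectionCount(0);
        return after > before;
    }

    public static void Main()
    {
        Console.WriteLine(CollectRuns());   // True
    }
}
```

In ordinary code it is usually better to leave the timing of collections to the runtime, as the note above says.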
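Memory Management and Pointers in C# (2)
7.2 Freeing Unmanaged Resources
The presence of the garbage collector means that you usually do not need to worry about objects you no longer need; you simply let all references to them go out of scope and allow the garbage collector to free resources as required. However, the garbage collector does not know how to free unmanaged resources (such as file handles, network connections, and database connections). When managed classes encapsulate direct or indirect references to unmanaged resources, you need to make special provision to ensure that the unmanaged resources are released when an instance of the class is garbage collected.
When defining a class, you can use two mechanisms to automate the freeing of unmanaged resources. They are often implemented together, because each offers a slightly different approach to the problem. The mechanisms are:
● Declaring a destructor as a member of the class
● Implementing the System.IDisposable interface in the class
The following sections discuss each mechanism in turn and then show how to implement them together for best effect.
7.2.1 Destructors
You have seen that constructors let you specify actions that must take place whenever an instance of a class is created. Conversely, destructors are called when the garbage collector destroys an object. Given this behavior, a destructor might at first seem like the ideal place to put code that frees unmanaged resources and performs general cleanup. Unfortunately, things are not so simple.
Note:
Although we talk about destructors in C#, in the underlying .NET architecture these are known as finalizers. When you define a destructor in C#, what the compiler emits into the assembly is actually a method called Finalize(). This does not affect your source code, but you need to be aware of it if you examine the contents of an assembly.
The syntax for a destructor will be familiar to C++ developers. It looks like a method with the same name as the containing class, but prefixed with a tilde (~). It has no return type, takes no parameters, and has no access modifiers. Here is an example: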
class MyClass
{
~MyClass()
{
// implementation
}
}
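When the C# compiler compiles a destructor, it implicitly translates the destructor code into the equivalent of a Finalize() method, which ensures that the Finalize() method of the parent class is executed. The following shows the C# equivalent of the IL the compiler would generate for the ~MyClass() destructor: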
protected override void Finalize()
{
try
{
// implementation
}
finally
{
base.Finalize();
}
}
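As shown, the code in the ~MyClass() destructor is wrapped in a try block inside the Finalize() method. The call to the parent's Finalize() method sits in the finally block, which guarantees that it is executed.
Experienced C++ developers make extensive use of destructors, sometimes not only to clean up resources but also to provide debugging information or perform other tasks. C# destructors are used far less than their C++ equivalents. The problem with C# destructors compared with C++ ones is that they are nondeterministic. When a C++ object is destroyed, its destructor runs immediately. Because of the way the garbage collector works, however, there is no way of knowing when a C# object's destructor will actually execute. So you cannot place code in the destructor that relies on being run at a certain time, and you should not rely on the destructors of different class instances being called in any particular order. If your object holds scarce and critical resources that need to be freed as soon as possible, you cannot wait for the garbage collector to free them.
Another problem is that implementing a destructor delays the final removal of an object from memory. Objects without destructors are removed from memory in one pass of the garbage collector, but objects with destructors require two passes: the first pass calls the destructor without removing the object, and the second actually removes it. In addition, the runtime uses a single thread to execute the Finalize() methods of all objects. If you use destructors frequently, and use them for lengthy cleanup tasks, the impact on performance can be noticeable.
7.2.2 The IDisposable Interface
A recommended alternative to destructors is the System.IDisposable interface. The IDisposable interface defines a pattern (with language-level support) that provides a deterministic mechanism for freeing unmanaged resources and avoids the garbage-collector-related problems inherent in destructors. IDisposable declares a single method, Dispose(), which takes no parameters and returns void. An implementation of Dispose() for a class Myclass looks like this: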
class Myclass : IDisposable
{
public void Dispose()
{
// implementation
}
}
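The implementation of Dispose() should explicitly free all unmanaged resources used directly by the object and call Dispose() on any encapsulated objects that also implement IDisposable. In this way the Dispose() method gives precise control over when unmanaged resources are freed.
Suppose you have a class called ResourceGobbler that uses some external resource and implements IDisposable. If you want to instantiate this class, use it, and then dispose of it, you could write: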
ResourceGobbler theInstance = new ResourceGobbler();
// do your processing
theInstance.Dispose();
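This code fails to free the resources consumed by theInstance if an exception occurs during processing, so you should use a try block and write the code as follows: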
ResourceGobbler theInstance = null;
try
{
theInstance = new ResourceGobbler();
// do your processing
}
finally
{
if (theInstance != null) theInstance.Dispose();
}
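This version ensures that Dispose() is always called on theInstance and that any resources it consumes are always freed, even if an exception occurs during processing. However, the code becomes cluttered if you have to repeat this construct every time. C# offers a syntax that guarantees that Dispose() (but not Close()) is called automatically on an object when its reference goes out of scope. The syntax uses the using keyword, here in a completely different context that has nothing to do with namespaces. The following code generates IL equivalent to the try block just shown: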
using (ResourceGobbler theInstance = new ResourceGobbler())
{
// do your processing
}
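The using statement is followed by a pair of parentheses containing the declaration and instantiation of a reference variable, and it scopes that variable to the accompanying compound statement. In addition, when the variable goes out of scope, its Dispose() method is called automatically, even if an exception occurs. If you are already using try blocks to catch other exceptions, it can be cleaner, and saves an extra level of indentation, to avoid the using statement and simply call Dispose() in the finally clause of the existing try block.
Note:
For some classes, the notion of a Close() method is more logical than Dispose(), for example when dealing with files or database connections. In these cases it is common to implement the IDisposable interface and then implement a separate Close() method that simply calls Dispose(). This approach makes the class clearer to use and still supports the using statement provided by C#.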
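The Close()-forwards-to-Dispose() arrangement described in the note, together with the behavior of using when an exception is thrown, might look like the sketch below. ConnectionSketch and its IsOpen property are hypothetical names invented for this illustration; no real file or database connection is involved.

```csharp
using System;

// Hypothetical resource wrapper; the name and members are illustrative only.
public class ConnectionSketch : IDisposable
{
    private bool isOpen = true;

    public bool IsOpen
    {
        get { return isOpen; }
    }

    // Close() reads more naturally for a connection and simply forwards to Dispose().
    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        // A real class would release its underlying resource here.
        isOpen = false;
    }
}

public class UsingDemo
{
    // Returns the IsOpen state after an exception escaped from a using block.
    public static bool IsOpenAfterException()
    {
        ConnectionSketch connection = new ConnectionSketch();
        try
        {
            using (connection)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Dispose() has already run by the time control reaches this handler.
        }
        return connection.IsOpen;
    }

    public static void Main()
    {
        Console.WriteLine(IsOpenAfterException());   // False
    }
}
```

The variable is declared outside the using statement here only so that its state can be inspected afterwards; in normal code the declaration would sit inside the parentheses as shown earlier.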
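7.2.3 Implementing IDisposable and a Destructor
The previous sections discussed two ways in which a class can free unmanaged resources:
● Through a destructor, whose execution is enforced by the runtime but is nondeterministic and, because of the way the garbage collector works, adds an unacceptable overhead to the runtime.
● Through the IDisposable interface, which provides a mechanism that lets users of the class control when resources are freed but relies on them making sure that Dispose() is called.
In general, the best approach is to implement both mechanisms, to gain the benefits of both while overcoming their limitations. You implement IDisposable on the assumption that most programmers will call Dispose() correctly, and implement a destructor as a safety mechanism in case Dispose() is not called. Here is an example of a dual implementation: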
public class ResourceHolder : IDisposable
{
private bool isDisposed = false;
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
protected virtual void Dispose(bool disposing)
{
if (!isDisposed)
{
if (disposing)
{
// Cleanup managed objects by calling their Dispose() methods.
}
// Cleanup unmanaged objects
}
isDisposed=true;
}
~ResourceHolder()
{
Dispose (false);
}
}
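You can see that there is a second, protected overload of Dispose() that takes one bool parameter, and this is the method that really does the cleaning up. Dispose(bool) is called by both the destructor and IDisposable.Dispose(). The point of this approach is to ensure that all cleanup code is in one place.
The parameter passed to Dispose(bool) indicates whether Dispose(bool) has been invoked by the destructor or by IDisposable.Dispose(). Dispose(bool) should not be invoked from anywhere else in your code. The idea is this:
● If a consumer calls IDisposable.Dispose(), that consumer is indicating that all resources associated with the object, both managed and unmanaged, should be cleaned up.
● If a destructor has been invoked, all resources still need, in principle, to be cleaned up. In this case, however, the destructor must have been called by the garbage collector, and you should not attempt to access other managed objects because you can no longer be certain of their state. The best you can do is clean up the known unmanaged resources and hope that any referenced managed objects also have destructors that perform their own cleanup.
The isDisposed member variable indicates whether the object has already been disposed of, and ensures that the cleanup is not performed more than once. This simple approach is not thread-safe and depends on the caller ensuring that only one thread calls the method at a time. Requiring the consumer to enforce synchronization is a reasonable assumption, and one that is used repeatedly throughout the .NET class libraries (in the collection classes, for example).
Finally, IDisposable.Dispose() contains a call to System.GC.SuppressFinalize(). SuppressFinalize() tells the garbage collector that the object no longer needs to have its destructor called. Because Dispose() has already done all the necessary cleanup, there is nothing left for the destructor to do. Calling SuppressFinalize() means that the garbage collector treats the object as if it had no destructor at all. |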
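To see the effect of the isDisposed flag, the following sketch follows the same dual pattern but counts how often the cleanup code actually runs. CountingHolder and its CleanupCount field are hypothetical and exist only for this demonstration.

```csharp
using System;

// Hypothetical class following the dual pattern; CleanupCount exists only for the demo.
public class CountingHolder : IDisposable
{
    private bool isDisposed = false;
    public int CleanupCount = 0;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // the destructor no longer needs to run
    }

    protected virtual void Dispose(bool disposing)
    {
        if (!isDisposed)
        {
            // Real cleanup would go here; this sketch only counts how often it runs.
            CleanupCount++;
        }
        isDisposed = true;
    }

    ~CountingHolder()
    {
        Dispose(false);
    }
}

public class DisposeTwiceDemo
{
    public static int CleanupsAfterTwoCalls()
    {
        CountingHolder holder = new CountingHolder();
        holder.Dispose();
        holder.Dispose();   // the second call finds isDisposed set and does nothing
        return holder.CleanupCount;
    }

    public static void Main()
    {
        Console.WriteLine(CleanupsAfterTwoCalls());   // 1
    }
}
```

As the text notes, the flag alone does not make the class thread-safe; the sketch calls Dispose() from a single thread.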
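7.3 Unsafe Code
As the preceding sections showed, C# is very good at hiding basic memory management, thanks to the garbage collector and references. However, there are times when you need direct access to memory, for example for performance reasons, or because you need to call a function in an external (non-.NET) DLL that requires a pointer to be passed as a parameter (as many Windows API functions do). This section examines the C# facilities for direct access to the contents of memory.
7.3.1 Pointers
Although pointers are introduced here as if they were a new topic, in reality they are nothing new: you have been using references freely in your code, and a reference is simply a type-safe pointer. You have already seen how variables that represent objects and arrays actually hold the memory address at which the corresponding data (the referent) is stored. A pointer is simply a variable that stores an address in the same way a reference does. The difference is that C# does not let you get at the address contained in a reference variable directly. With a reference, the variable is treated syntactically as if it stored the actual contents of the referent.
C# references are designed to make the language simpler to use and to prevent you from inadvertently doing something that corrupts the contents of memory. With a pointer, on the other hand, the actual memory address is available to you, which lets you perform new kinds of operations. For example, you can add 4 bytes to the address so that you can examine or even modify whatever data is stored at the new address.
The two main reasons for using pointers are:
● Backward compatibility. Despite all the facilities provided by the .NET runtime, it is still possible to call native Windows API functions, and for some operations this may be the only way to accomplish the task. These API functions are written in C and often require pointers as parameters. In many cases, however, you can write the DllImport declaration in a way that avoids pointers, for example by using the System.IntPtr class.
● Performance. On those occasions where speed is of the utmost importance, pointers can provide a route to optimized performance. If you know what you are doing, you can ensure that data is accessed or manipulated in the most efficient way. Be aware, though, that there are often other areas of your code where you can make the necessary performance improvements without resorting to pointers. Use a code profiler to look for the bottlenecks in your code; one ships with VS.NET.
Low-level memory access comes at a price. The syntax for using pointers is more complex than that for reference types, and pointers are more difficult to use correctly. You need good programming skills and the ability to think carefully and logically about what your code is doing in order to use them successfully. If you are not careful, it is very easy to introduce subtle, hard-to-find bugs. For example, it is easy to overwrite other variables, cause stack overflows, access areas of memory that do not store any variables, or even overwrite information about your code that the .NET runtime needs, thereby crashing your program.
In addition, if you use pointers your code must be granted a high level of trust by the runtime's code access security mechanism, or it will not be allowed to execute. Under the default code access security policy, this is possible only if your code is running on the local machine. If your code must run from a remote location, such as the Internet, users must grant your code additional permissions for it to work. Unless users trust you and your code, they are unlikely to grant these permissions.
Despite these issues, pointers remain a very powerful and flexible tool for writing efficient code, and they are covered here for that reason.
Note:
We strongly advise against using pointers, because code that uses them is not only harder to write and debug but also fails the memory type-safety checks imposed by the CLR.
1. Writing unsafe code
Because of the risks associated with pointers, C# allows their use only in blocks of code that you have specifically marked for this purpose. The keyword for doing so is unsafe. The following code marks a method as unsafe: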
unsafe int GetSomeNumber()
{
// code that can use pointers
}
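Any method can be marked as unsafe, regardless of what other modifiers have been applied to it (for example, static or virtual methods). In such a method, the unsafe modifier also applies to the method's parameters, allowing you to use pointers as parameters. You can also mark an entire class or struct as unsafe, which means that all its members are treated as unsafe: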
unsafe class MyClass
{
// any method in this class can now use pointers
}
Similarly, you can mark a member as unsafe:
class MyClass
{
unsafe int *pX; // declaration of a pointer field in a class
}
You can also mark a block of code within a method as unsafe:
void MyMethod()
{
// code that doesn't use pointers
unsafe
{
// unsafe code that uses pointers here
}
// more 'safe' code that doesn't use pointers
}
Note, however, that you cannot mark a local variable by itself as unsafe:
int MyMethod()
{
unsafe int *pX; // WRONG
}
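If you want to use an unsafe local variable, you need to declare and use it inside a method or block that is unsafe. There is one more step before you can use pointers. The C# compiler rejects unsafe code unless you tell it that your code includes unsafe blocks. The compiler flag for this is unsafe. Hence, to compile a file named MySource.cs that contains unsafe blocks (assuming no other compiler options), the command is: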
csc /unsafe MySource.cs
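or
csc -unsafe MySource.cs
Note:
If you are using Visual Studio .NET, you will find the option to compile unsafe code in the project properties. The unsafe compilation option has already been set in the Visual Studio .NET versions of the downloadable samples for this section.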
2. Pointer syntax
Once you have marked a block of code as unsafe, you can declare a pointer using this syntax:
int* pWidth, pHeight;
double* pResult;
byte*[] pFlags;
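This code declares four variables: pWidth and pHeight are pointers to integers, pResult is a pointer to a double, and pFlags is an array of pointers to bytes. It is common practice to prefix the names of pointer variables with p to indicate that they are pointers. In a variable declaration, the symbol * denotes that you are declaring a pointer, that is, something that stores the address of a variable of the specified type.
Tip:
C++ developers should note that this syntax differs from C++. The C# statement int* pX, pY; corresponds to the C++ statement int *pX, *pY;. In C#, the * symbol is associated with the type rather than with the variable name.
Once you have declared variables of pointer types, you can use them in the same way as normal variables, but first you need to learn two more operators:
● & means "take the address of", and converts a value data type to a pointer, for example int to int*. This operator is known as the address operator.
● * means "get the contents of this address", and converts a pointer to a value data type (for example, float* to float). This operator is known as the indirection operator (or sometimes the dereference operator).
As these definitions show, & and * have opposite effects.
Note:
You might wonder how the symbols & and * can be used in this way, given that they also denote the bitwise AND (&) and multiplication (*) operators. The answer is that in practice they are never confused: both you and the compiler can always tell which meaning is intended, because with the new pointer meanings these symbols always appear as unary operators, acting on a single variable and appearing in front of it in the code. Bitwise AND and multiplication, by contrast, are binary operators that require two operands.
The following code shows how to use these operators: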
int x = 10;
int* pX, pY;
pX = &x;
pY = pX;
*pY = 20;
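First an integer x is declared, followed by two pointers to integers, pX and pY. pX is then set to point to x (in other words, the contents of pX are set to the address of x). The value of pX is assigned to pY, so that pY also points to x. Finally, the statement *pY = 20 assigns the value 20 to the location pointed to by pY. In effect this changes the contents of x to 20, because pY points to x. Note that there is no particular connection between the variables pY and x; pY simply happens to point to the memory location at which x is stored.
To understand this further, suppose x is stored at memory locations 0x12F8C4 to 0x12F8C7 on the stack (1243332 to 1243335 in decimal; there are 4 locations because an int occupies 4 bytes). Because the stack allocates memory downward, the variable pX is stored at locations 0x12F8C0 to 0x12F8C3 and pY at locations 0x12F8BC to 0x12F8BF. Note that pX and pY also occupy 4 bytes each. This is not because an int occupies 4 bytes; it is because on a 32-bit processor 4 bytes are needed to store an address. With these addresses, after the code above has executed the stack looks like Figure 7-5.
Figure 7-5
Note:
This example uses int to illustrate the process, and ints are stored at consecutive locations on the stack on a 32-bit processor, but this does not happen for all data types. The reason is that 32-bit processors work best when retrieving data from memory in 4-byte chunks. Memory on such machines tends to be divided into 4-byte blocks, and on Windows each block is sometimes known as a DWORD, because this was the name of a 32-bit unsigned int before .NET came along. It is most efficient to fetch DWORDs from memory; storing data across DWORD boundaries usually incurs a hardware performance hit. For this reason the .NET runtime normally pads some data types so that the memory they occupy is a multiple of 4 bytes. For example, a short occupies 2 bytes, but if a short is placed on the stack the stack pointer is still decremented by 4, not 2, so that the next variable to go on the stack still starts at a DWORD boundary.
You can declare a pointer to any value type, that is, to any of the predefined types such as uint, int, and byte, or to a struct. However, you cannot declare a pointer to a class or an array, because doing so could cause problems for the garbage collector. To work properly, the garbage collector needs to know exactly which class instances have been created on the heap and where they are. If your code started manipulating classes using pointers, it could very easily corrupt the class-related information that the .NET runtime maintains on the heap for the garbage collector. In this context, any data type that the garbage collector can access is known as a managed type, and pointers can be declared only to unmanaged types, since the garbage collector cannot deal with them. |
3. Casting pointers to integer types
Because a pointer really stores an integer that represents an address, the address in any pointer can be converted to any integer type. Pointer-to-integer conversions must be explicit; implicit conversions are not allowed. For example, the following code is legal: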
int x = 10;
int* pX, pY;
pX = &x;
pY = pX;
*pY = 20;
uint y = (uint)pX;
int* pD = (int*)y;
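The address held in the pointer pX is cast to a uint and stored in the variable y. y is then cast back to an int* and stored in the new variable pD, so pD also points to the value of x.
The main reason for casting a pointer value to an integer type is to display it. The Console.Write() and Console.WriteLine() methods do not have any overloads that take pointers, so you must cast the pointer to an integer type before they will accept and display it: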
Console.WriteLine("Address is" + pX); // wrong – will give a
// compilation error
Console.WriteLine("Address is" + (uint) pX); // OK
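You can cast a pointer to any integer type. However, because an address occupies 4 bytes on 32-bit systems, casting a pointer to anything other than a uint, long, or ulong is almost certain to lead to overflow errors. (An int can cause the same problem, because its range is roughly -2 billion to 2 billion, whereas an address runs from zero to about 4 billion.) C# is also available for 64-bit processors, where an address occupies 8 bytes. On such systems, casting a pointer to anything other than a ulong is likely to lead to overflow errors. Note also that the checked keyword does not apply to conversions involving pointers. For such conversions, no exception is raised when an overflow occurs, even in a checked context. The .NET runtime assumes that if you are using pointers you know what you are doing and are prepared for possible overflows.
4. Casting between pointer types
You can also convert explicitly between pointers to different types. For example: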
byte aByte = 8;
byte* pByte= &aByte;
double* pDouble = (double*)pByte;
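This is legal code, but be careful if you try something like it. In the example above, if you look at the double pointed to by pDouble, you will actually be looking at some memory that contains one byte (aByte) combined with some other memory, all treated as if it were an area of memory containing a double, which will not give you a meaningful value. However, you might want to convert between types in order to implement the equivalent of a C union, or you might want to cast pointers to other types into pointers to sbyte in order to examine the individual bytes of memory.
5. void pointers
If you want to maintain a pointer but do not want to specify what type of data it points to, you can declare it as a pointer to void: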
int* pointerToInt;
void* pointerToVoid;
pointerToVoid = (void*)pointerToInt;
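The main use of void pointers is to call API functions that require void* parameters. Within the C# language there is not a great deal you can do with void pointers. In particular, the compiler flags an error if you attempt to dereference a void pointer using the * operator.
6. Pointer arithmetic
You can add integers to pointers and subtract them. However, the compiler is quite clever about how it arranges this. For example, suppose you have a pointer to an int and you add 1 to its value. The compiler assumes that you actually want to look at the memory location following the int, and so it increases the value by 4 bytes, the size of an int. If it is a pointer to a double, adding 1 actually increases the value of the pointer by 8 bytes, the size of a double. Only if the pointer points to a byte or sbyte (1 byte each) does adding 1 to the pointer actually change its value by 1.
You can use the operators +, -, +=, -=, ++, and -- with pointers, with the variable on the right-hand side of these operators being a long or ulong.
Note:
Arithmetic operations are not permitted on void pointers.
For example, assume these definitions: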
uint u = 3;
byte b = 8;
double d = 10.0;
uint* pUint= &u; // size of a uint is 4
byte* pByte = &b; // size of a byte is 1
double* pDouble = &d; // size of a double is 8
Next, assume the addresses to which these pointers point are:
● pUint: 1243332
● pByte: 1243328
● pDouble: 1243320
Then execute this code:
++pUint; // adds (1*4)= 4 bytes to pUint
pByte -= 3; // subtracts (3*1)=3 bytes from pByte
double* pDouble2 = pDouble + 4; // pDouble2 = pDouble + 32 bytes (4*8 bytes)
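The pointers now contain:
● pUint: 1243336
● pByte: 1243325
● pDouble2: 1243352
Tip:
The general rule is that adding a number X to a pointer to type T whose value is P gives the result P + X*(sizeof(T)).
Note:
Be careful with this rule. If successive values of a given type are stored in successive memory locations, pointer addition works well for moving between them. But if the type is byte or char, whose total size is not a multiple of 4, successive values are not, by default, stored in successive memory locations.
You can also subtract one pointer from another if both point to the same data type. In this case the result is a long whose value is the difference between the pointer values divided by the size of the type they point to: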
double* pD1 = (double*)1243324; // note that it is perfectly valid to
// initialize a pointer like this.
double* pD2 = (double*)1243300;
long L = pD1-pD2; // gives the result 3 (=24/sizeof(double)) |
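The two rules above, P + X*(sizeof(T)) for addition and the difference divided by sizeof(T) for subtraction, can be checked without any unsafe code by modelling addresses as plain long values. This is only an arithmetic sketch; the helper names Add() and Difference() are made up for the illustration, and it assumes a compiler that accepts sizeof on predefined types outside unsafe code (later C# compilers do, whereas the compiler used in this article requires an unsafe context).

```csharp
using System;

public class PointerMathSketch
{
    // Models P + X * sizeof(T), with the address held in an ordinary long.
    public static long Add(long address, long count, int elementSize)
    {
        return address + count * elementSize;
    }

    // Models pointer subtraction: (P1 - P2) / sizeof(T).
    public static long Difference(long p1, long p2, int elementSize)
    {
        return (p1 - p2) / elementSize;
    }

    public static void Main()
    {
        Console.WriteLine(Add(1243332, 1, sizeof(uint)));      // 1243336  (++pUint)
        Console.WriteLine(Add(1243328, -3, sizeof(byte)));     // 1243325  (pByte -= 3)
        Console.WriteLine(Add(1243320, 4, sizeof(double)));    // 1243352  (pDouble + 4)
        Console.WriteLine(Difference(1243324, 1243300, sizeof(double)));   // 3  (pD1 - pD2)
    }
}
```

The four results match the values given in the text for pUint, pByte, pDouble2, and L.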
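Memory Management and Pointers in C# (4)
7. The sizeof operator
This section shows how to determine the sizes of various data types. If you need the size of a type in your code, you can use the sizeof operator, which takes the name of a data type as its parameter and returns the number of bytes the type occupies. For example: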
int x = sizeof(double);
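This sets x to the value 8.
The advantage of using sizeof is that you do not have to hard-code data type sizes in your code, which makes it more portable. For the predefined data types, sizeof returns the values shown in Table 7-1.
Table 7-1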
sizeof(sbyte) = 1; sizeof(byte) = 1;
sizeof(short) = 2; sizeof(ushort) = 2;
sizeof(int) = 4; sizeof(uint) = 4;
sizeof(long) = 8; sizeof(ulong) = 8;
sizeof(char) = 2; sizeof(float) = 4;
sizeof(double) = 8; sizeof(bool) = 1;
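You can also use sizeof on structs that you define yourself, although in that case the result depends on the fields in the struct. You cannot use sizeof on classes, and it can be used only in unsafe code blocks.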
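How the size of a struct depends on its fields and their layout can be explored without unsafe code by asking the interop layer for the struct's size. The sketch below uses System.Runtime.InteropServices.Marshal.SizeOf(), which reports the marshaled (unmanaged) size and is therefore only a stand-in for sizeof, not the same operator. The two structs are hypothetical; they differ only in packing.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical structs with the same fields: a long (8 bytes) and a byte (1 byte).
[StructLayout(LayoutKind.Sequential)]
public struct PaddedSketch
{
    public long Dollars;
    public byte Cents;
}

// Pack = 1 asks for no padding between or after fields.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedSketch
{
    public long Dollars;
    public byte Cents;
}

public class StructSizeSketch
{
    public static int PaddedSize()
    {
        return Marshal.SizeOf(typeof(PaddedSketch));
    }

    public static int PackedSize()
    {
        return Marshal.SizeOf(typeof(PackedSketch));
    }

    public static void Main()
    {
        Console.WriteLine("Default layout: " + PaddedSize());   // larger than 9 because of padding
        Console.WriteLine("Pack = 1:       " + PackedSize());   // 9
    }
}
```

With the default layout the size is rounded up for alignment, so it exceeds the 9 bytes the fields need; the exact figure can depend on the platform.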
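8. Pointers to structs: the pointer member access operator
Pointers to structs work in exactly the same way as pointers to the predefined value types. There is one condition, however: the struct must not contain any reference types. This follows from the restriction mentioned earlier that pointers cannot point to any reference types. To enforce this, the compiler flags an error if you create a pointer to a struct that contains any reference types.
Suppose you have a struct defined like this: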
struct MyStruct
{
public long X;
public float F;
}
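You could define a pointer to it like this:
MyStruct* pStruct;
and then initialize it:
MyStruct Struct = new MyStruct();
pStruct = &Struct;
You can also access the member values of the struct through the pointer:
(*pStruct).X = 4;
(*pStruct).F = 3.4f;
However, this syntax is a bit complex. For this reason, C# defines another operator that lets you access members of structs through pointers using a simpler syntax. It is known as the pointer member access operator, and its symbol is a dash followed by a greater-than sign: ->.
Note:
C++ developers will recognize the pointer member access operator, because C++ uses the same symbol for the same purpose.
Using the pointer member access operator, the previous code can be rewritten as:
pStruct->X = 4;
pStruct->F = 3.4f;
You can also directly set up pointers of the appropriate type to point to fields within a struct:
long* pL = &(Struct.X);
float* pF = &(Struct.F);
or
long* pL = &(pStruct->X);
float* pF = &(pStruct->F);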
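9. Pointers to class members
As mentioned earlier, you cannot create pointers to classes. The garbage collector maintains no information about pointers, only about references, so creating pointers to classes could prevent garbage collection from working properly.
However, most classes contain value-type members, and you can create pointers to them, but this requires a special syntax. For example, suppose the struct from the previous example is rewritten as a class: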
class MyClass
{
public long X;
public float F;
}
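You might then want to create pointers to its fields X and F in the same way as before. Doing so, however, produces a compilation error:
MyClass myObject = new MyClass();
long* pL = &(myObject.X); // wrong -- compilation error
float* pF = &(myObject.F); // wrong -- compilation error
Although X and F are unmanaged types, they are embedded in an object, which sits on the heap. During garbage collection, the garbage collector might move the MyClass object to a new location in memory, which would leave pL and pF pointing to the wrong memory addresses. Because of this, the compiler does not let you assign the addresses of members of managed types to pointers in this way.
The solution is to use the fixed keyword, which tells the garbage collector that there may be pointers to members of certain class instances, so those instances must not be moved. The syntax for using fixed when declaring a single pointer looks like this: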
MyClass myObject = new MyClass();
fixed (long* pObject = &( myObject.X))
{
// do something
}
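The pointer variable is defined and initialized in the parentheses following the keyword fixed. This pointer variable (pObject in the example) is scoped to the fixed block, so the garbage collector knows not to move the myObject object while the code inside the fixed block is executing.
If you want to declare more than one such pointer, you can place multiple fixed statements before the same code block: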
MyClass myObject = new MyClass();
fixed (long* pX = &( myObject.X))
fixed (float* pF = &( myObject.F))
{
// do something
}
You can nest entire fixed blocks if you want to fix several pointers for different periods:
MyClass myObject = new MyClass();
fixed (long* pX = &( myObject.X))
{
// do something with pX
fixed (float* pF = &( myObject.F))
{
// do something else with pF
}
}
You can also initialize several variables within the same fixed statement, provided they are of the same type:
MyClass myObject = new MyClass();
MyClass myObject2 = new MyClass();
fixed (long* pX = &( myObject.X), pX2 = &( myObject2.X))
{
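// etc.
}
In all these cases, it makes no difference whether the various pointers you declare point to fields in the same or different objects, or to static fields not associated with any class instance.
10. Pointer example: PointerPlayaround
Here is an example that uses pointers, called PointerPlayaround. It performs some simple pointer manipulation and displays the results, which lets you see what is happening in memory and where variables are stored: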
using System;
namespace Wrox.ProCSharp.Chapter07
{
class MainEntryPoint
{
static unsafe void Main()
{
int x=10;
short y = -1;
byte y2 = 4;
double z = 1.5;
int* pX = &x;
short* pY = &y;
double* pZ = &z;
Console.WriteLine(
"Address of x is 0x{0:X}, size is {1}, value is {2}",
(uint)&x, sizeof(int), x);
Console.WriteLine(
"Address of y is 0x{0:X}, size is {1}, value is {2}",
(uint)&y, sizeof(short), y);
Console.WriteLine(
"Address of y2 is 0x{0:X}, size is {1}, value is {2}",
(uint)&y2, sizeof(byte), y2);
Console.WriteLine(
"Address of z is 0x{0:X}, size is {1}, value is {2}",
(uint)&z, sizeof(double), z);
Console.WriteLine(
"Address of pX=&x is 0x{0:X}, size is {1}, value is 0x{2:X}",
(uint)&pX, sizeof(int*), (uint)pX);
Console.WriteLine(
"Address of pY=&y is 0x{0:X}, size is {1}, value is 0x{2:X}",
(uint)&pY, sizeof(short*), (uint)pY);
Console.WriteLine(
"Address of pZ=&z is 0x{0:X}, size is {1}, value is 0x{2:X}",
(uint)&pZ, sizeof(double*), (uint)pZ);
*pX = 20;
Console.WriteLine("After setting *pX, x = {0}", x);
Console.WriteLine("*pX = {0}", *pX);
pZ = (double*)pX;
Console.WriteLine("x treated as a double = {0}", *pZ);
Console.ReadLine();
}
}
}
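This code declares four value variables:
● int x
● short y
● byte y2
● double z
It also declares pointers to three of these values: pX, pY, and pZ.
The values of the variables are then displayed, along with their sizes and addresses. Note that in taking the addresses of pX, pY, and pZ, we are effectively looking at a pointer to a pointer, that is, the address of the address of a value. Also note that, in keeping with the usual way of displaying addresses, the {0:X} format specifier is used in the Console.WriteLine() calls to ensure that memory addresses are displayed in hexadecimal format.
Finally, the pointer pX is used to change the value of x to 20, and some pointer casting is done to see what happens if the contents of x are treated as if they were a double, which produces a meaningless result.
Compiling and running this code gives the following output. The listing shows what happens when you compile both without and with the /unsafe flag: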
csc PointerPlayaround.cs
Microsoft (R) Visual C# .NET Compiler version 7.10.3052.4
for Microsoft (R) .NET Framework version 1.1.4322
Copyright (C) Microsoft Corporation 2001–2002. All rights reserved.
PointerPlayaround.cs(7,26): error CS0227: Unsafe code may only appear if
compiling with /unsafe
csc /unsafe PointerPlayaround.cs
Microsoft (R) Visual C# .NET Compiler version 7.10.3052.4
for Microsoft (R) .NET Framework version 1.1.4322
Copyright (C) Microsoft Corporation 2001-2002. All rights reserved.
PointerPlayaround
Address of x is 0x12F8C4, size is 4, value is 10
Address of y is 0x12F8C0, size is 2, value is -1
Address of y2 is 0x12F8BC, size is 1, value is 4
Address of z is 0x12F8B4, size is 8, value is 1.5
Address of pX=&x is 0x12F8B0, size is 4, value is 0x12F8C4
Address of pY=&y is 0x12F8AC, size is 4, value is 0x12F8C0
Address of pZ=&z is 0x12F8A8, size is 4, value is 0x12F8B4
After setting *pX, x = 20
*pX = 20
x treated as a double = 2.63837073472194E-308
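Checking through these results confirms the description of how the stack operates given in the "Memory Management Under the Hood" section earlier: the stack allocates memory to variables successively downward. Note how it also confirms that blocks of memory on the stack are always allocated in multiples of 4 bytes. For example, y is a short (size 2) and has the address 1243328 (decimal), indicating that the memory locations reserved for it are 1243328 to 1243331. If the .NET runtime had been packing variables strictly next to each other, y would have occupied just two locations, 1243328 and 1243329. |
11. Adding classes and structs to the example
This section illustrates pointer arithmetic, as well as pointers to structs and class members, using a second example called PointerPlayaround2. To start, define a struct named CurrencyStruct, which represents a currency value as dollars and cents, and an equivalent class named CurrencyClass: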
struct CurrencyStruct
{
public long Dollars;
public byte Cents;
public override string ToString()
{
return "$" + Dollars + "." + Cents;
}
}
class CurrencyClass
{
public long Dollars;
public byte Cents;
public override string ToString()
{
return "$" + Dollars + "." + Cents;
}
}
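With the struct and class defined, you can apply some pointers to them. The following is the code for the new example. Because the code is fairly long, we go through it in detail. It starts by displaying the size of CurrencyStruct, creating two instances of it and some pointers, then uses the pointers to initialize one of the CurrencyStruct instances, amount1, and displays the addresses of the variables: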
public static unsafe void Main()
{
Console.WriteLine(
"Size of Currency struct is " + sizeof(CurrencyStruct));
CurrencyStruct amount1, amount2;
CurrencyStruct* pAmount = &amount1;
long* pDollars = &(pAmount->Dollars);
byte* pCents = &(pAmount->Cents);
Console.WriteLine("Address of amount1 is 0x{0:X}", (uint)&amount1);
Console.WriteLine("Address of amount2 is 0x{0:X}", (uint)&amount2);
Console.WriteLine("Address of pAmount is 0x{0:X}", (uint)&pAmount);
Console.WriteLine("Address of pDollars is 0x{0:X}", (uint)&pDollars);
Console.WriteLine("Address of pCents is 0x{0:X}", (uint)&pCents);
pAmount->Dollars = 20;
*pCents = 50;
Console.WriteLine("amount1 contains " + amount1);
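Now we do some pointer manipulation that relies on our knowledge of how the stack works. Because of the order in which the variables were declared, amount2 is stored at the address immediately below amount1. The sizeof operator returns 16 for CurrencyStruct (as shown in the screen output later), so CurrencyStruct occupies a multiple of 4 bytes. Therefore, after the currency pointer is decremented, it points to amount2:
--pAmount; // this should get it to point to amount2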
Console.WriteLine("amount2 has address 0x{0:X} and contains {1}",
(uint)pAmount, *pAmount);
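Notice that when Console.WriteLine() is called, it displays the contents of amount2, which has not yet been initialized. What is displayed is random garbage: whatever happened to be stored at that location in memory before the example ran. There is an important point here: normally the C# compiler would prevent you from using an uninitialized variable, but once you start using pointers it is very easy to circumvent many of the usual compilation checks. In this case, the compiler has no way of knowing that we are actually displaying the contents of amount2. Only we know that, because our knowledge of the stack tells us what the effect of decrementing pAmount will be. Once you start doing pointer arithmetic, you can access all sorts of variables and memory locations that the compiler would usually stop you from accessing, which is why pointer arithmetic is described as unsafe.
Next in the example, pointer arithmetic is performed on the pCents pointer. pCents currently points to amount1.Cents, but the aim here is to get it to point to amount2.Cents, again using pointer operations rather than directly telling the compiler what we want. To do this, sizeof(CurrencyStruct) must be subtracted from the address that pCents contains: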
// do some clever casting to get pCents to point to cents
// inside amount2
CurrencyStruct* pTempCurrency = (CurrencyStruct*)pCents;
pCents = (byte*) (--pTempCurrency);
Console.WriteLine("Address of pCents is now 0x{0:X}", (uint)&pCents);
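Finally, the fixed keyword is used to create some pointers that point to the fields in a class instance, and these pointers are used to set the value of that instance. Notice that this is also the first time we have been able to look at the address of an item stored on the heap rather than the stack:
Console.WriteLine("\nNow with classes");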
// now try it out with classes
CurrencyClass amount3 = new CurrencyClass();
fixed(long* pDollars2 = &(amount3.Dollars))
fixed(byte* pCents2 = &(amount3.Cents))
{
Console.WriteLine(
"amount3.Dollars has address 0x{0:X}", (uint)pDollars2);
Console.WriteLine(
"amount3.Cents has address 0x{0:X}", (uint) pCents2);
*pDollars2 = -100;
Console.WriteLine("amount3 contains " + amount3);
}
Compiling and running this code gives output like the following:
csc /unsafe PointerPlayaround2.cs
Microsoft (R) Visual C# .NET Compiler version 7.10.3052.4
for Microsoft (R) .NET Framework version 1.1.4322
Copyright (C) Microsoft Corporation 2001–2002. All rights reserved.
PointerPlayaround2
Size of Currency struct is 16
Address of amount1 is 0x12F698
Address of amount2 is 0x12F688
Address of pAmount is 0x12F684
Address of pDollars is 0x12F680
Address of pCents is 0x12F67C
amount1 contains $20.50
amount2 has address 0x12F688 and contains $0.236
Address of pCents is now 0x12F67C
Now with classes
amount3.Dollars has address 0xB8850C
amount3.Cents has address 0x4B88514
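amount3 contains $-100.0
Note:
These results were obtained with version 1.1 of the .NET Framework. If you run the example on a different version of .NET, the actual addresses displayed will differ.
Notice in this output the uninitialized value of amount2 that is displayed, and that the size of the CurrencyStruct struct is 16, larger than the combined size of its fields (a long = 8 bytes plus a byte = 1 byte). This is the effect of the word alignment discussed earlier. |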
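Memory Management and Pointers in C# (5)
7.3.2 Using Pointers to Optimize Performance
A lot of space has been devoted so far to the various things you can do with pointers, but the previous examples have mostly just played around with memory in ways that interest people who like to know what is happening under the hood; they have not helped anyone write better code. This section applies our understanding of pointers and uses an example to show how pointers can bring a significant performance benefit.
1. Creating stack-based arrays
This section looks at one of the main areas in which pointers can be useful: creating high-performance, low-overhead arrays on the stack. C# makes it easy to use one-dimensional arrays and rectangular or jagged multidimensional arrays, but there is a drawback: these arrays are actually objects, instances of System.Array. This means that arrays can be stored only on the heap, with all the overhead that involves. There may be occasions when you just want a short-lived, high-performance array and do not want the overhead of reference objects. You can do this with pointers, although only for one-dimensional arrays.
To create a high-performance array you need another keyword: stackalloc. The stackalloc command instructs the .NET runtime to allocate a certain amount of memory on the stack. When you call it, you need to supply two pieces of information:
● The type of data you want to store
● How many of these data items you need to store
For example, to allocate enough memory to store 10 decimal data items, you can write: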
decimal* pDecimals = stackalloc decimal [10];
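Note that this command simply allocates the stack memory. It does not attempt to initialize the memory to any default value, which is fine for our purposes: this is meant to be a high-performance array, and initializing values unnecessarily would hurt performance.
Similarly, to store 20 double data items, you can write: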
double* pDoubles = stackalloc double [20];
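Although this line of code specifies the number of variables to store as a constant, the number can equally be a quantity evaluated at runtime. So you could write the previous example like this: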
int size;
size = 20; // or some other value calculated at run-time
double* pDoubles = stackalloc double [size];
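As these code snippets show, the syntax of stackalloc is slightly unusual. It is followed immediately by the name of the data type you want to store (which must be a value type), and then by the number of items you need in square brackets. The number of bytes allocated is the number of items multiplied by sizeof(data type). The use of square brackets here indicates an array: if you have allocated space for 20 doubles, what you have is an array of 20 doubles. The simplest type of array you can have is a block of memory that stores one element after another, as shown in Figure 7-6.
Figure 7-6
Figure 7-6 also shows the pointer returned by stackalloc, which is always a pointer to the allocated data type and points to the top of the newly allocated memory block. To use the memory block, you simply dereference the returned pointer. For example, to allocate space for 20 doubles and then set the first element (element 0 of the array) to the value 3.0, write: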
double* pDoubles = stackalloc double [20];
*pDoubles = 3.0;
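To access the next element of the array, you use pointer arithmetic. As described earlier, if you add 1 to a pointer, its value is increased by the size of the data type it points to. In this case, that is just enough to take you to the next free memory location in the block you have allocated. So you can set the second element of the array (element number 1) to the value 8.4 like this: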
double* pDoubles = stackalloc double [20];
*pDoubles = 3.0;
*(pDoubles+1) = 8.4;
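In the same way, you can access the element with index X of the array using the expression *(pDoubles+X).
This gives you a means of accessing the elements of the array, but for general-purpose use this syntax is too complex. C# therefore defines an alternative syntax. When square brackets are applied to pointers, C# gives them a very precise meaning. If the variable p is any pointer type and X is an integer, the compiler interprets the expression p[X] as *(p+X). This is true for all pointers, not just those initialized using stackalloc. With this shorthand notation you have a very convenient syntax for accessing the array. In fact, the syntax for accessing a one-dimensional stack-based array is exactly the same as that for accessing a heap-based array represented by the System.Array class: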
double *pDoubles = stackalloc double [20];
pDoubles[0] = 3.0; // pDoubles[0] is the same as *pDoubles
pDoubles[1] = 8.4; // pDoubles[1] is the same as *(pDoubles+1)
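Note:
Applying array syntax to pointers is nothing new. It has been a fundamental part of both C and C++ ever since those languages were invented. Indeed, C++ developers will recognize the stack-based arrays obtained here with stackalloc as being essentially identical to classic stack-based C and C++ arrays. This syntax, and the way it links pointers and arrays, is one of the reasons the C language became popular in the late 1970s, and the main reason the use of pointers became such a widespread programming technique in C and C++.
Although a high-performance array can be accessed in the same way as a normal C# array, a word of warning is in order. The following C# code raises an exception: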
double [] myDoubleArray = new double [20];
myDoubleArray[50] = 3.0;
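The exception occurs for an obvious reason: the code tries to access the array with an out-of-bounds index. The index is 50, whereas the maximum allowed value is 19. However, if you declare the equivalent array using stackalloc, there is no object wrapped around the array that can perform bounds checking. Hence, the following code does not raise an exception: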
double* pDoubles = stackalloc double [20];
pDoubles[50] = 3.0;
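In this code, we allocate enough memory to hold 20 doubles. We then instruct the computer to store the value 3.0 in the sizeof(double) memory locations that begin at the start of the allocated block plus 50*sizeof(double) locations. That location, however, is well outside the area of memory we have just allocated for the doubles. Nobody knows what data might be stored at that address. At best, we have used some currently unused memory, but the space overwritten could equally be locations on the stack that hold other variables, or even the return address of the method currently executing. Once again, the high performance gained with pointers comes at a price: you need to be certain that you know what you are doing, or you will get very strange runtime errors.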
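For comparison, later versions of the language offer a bounds-checked way of using stack-allocated memory. The sketch below is not part of the book's material: it assumes C# 7.2 or later and a runtime that provides System.Span<T> (for example .NET Core 2.1 or later), neither of which exists in the .NET 1.1 environment this article describes. The class and method names are hypothetical.

```csharp
using System;

public class SpanSketch
{
    // Returns true if writing past the end of the stack-allocated block is rejected.
    public static bool WriteOutOfRangeThrows()
    {
        // Stack allocation without an unsafe block; Span<T> remembers the length (20).
        Span<double> values = stackalloc double[20];
        values[0] = 3.0;
        values[1] = 8.4;

        try
        {
            values[50] = 3.0;   // index checked against the length at runtime
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteOutOfRangeThrows());   // True
    }
}
```

The raw pointer returned by stackalloc in unsafe code still performs no such check, so the warning above applies unchanged to the pointer-based examples in this article.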
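2. The QuickArray example
The discussion of pointers ends with a stackalloc example called QuickArray. In this example, the program simply asks the user how many elements to allocate for an array. The code then uses stackalloc to allocate an array of longs of that length. The elements of the array are populated with the squares of the integers starting from 0, and the results are displayed on the console: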
using System;
namespace Wrox.ProCSharp.Chapter07
{
class MainEntryPoint
{
static unsafe void Main()
{
Console.Write("How big an array do you want? \n> ");
string userInput = Console.ReadLine();
uint size = uint.Parse(userInput);
long* pArray = stackalloc long [(int)size];
for (int i=0 ; i<size ; i++)
pArray[i] = i*i;
for (int i=0 ; i<size ; i++)
Console.WriteLine("Element {0} = {1}", i, *(pArray+i));
}
}
}
Running the example gives the following output:
QuickArray
How big an array do you want?
> 15
Element 0 = 0
Element 1 = 1
Element 2 = 4
Element 3 = 9
Element 4 = 16
Element 5 = 25
Element 6 = 36
Element 7 = 49
Element 8 = 64
Element 9 = 81
Element 10 = 100
Element 11 = 121
Element 12 = 144
Element 13 = 169
Element 14 = 196
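7.4 Summary
To become a truly proficient C# programmer, you need a solid understanding of how memory allocation and garbage collection work. This article has described how the CLR manages and allocates memory on the heap and the stack, discussed how to write classes that free unmanaged resources correctly, and shown how to use pointers in C#. These are all advanced topics that are hard to understand and that novice programmers often implement incorrectly.
Excerpted from the Wrox Red Book "Advanced Programming in C# (3rd Edition)" published by Tsinghua University Press; reproductions must credit the source |
Thanks for the hard work, OP. But C# isn't C/C++, is it? |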
| |