Using Sequence Containers to Improve Type Safety and Security

Organization
Part I) Passing arrays to functions

Part II) Returning arrays from functions

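If you have trouble understanding any of the examples, consider reviewing the arrays and templates tutorials. For the templates tutorial, you only need to read the first section, which covers function templates.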

http://www.cplusplus.com/doc/tutorial/arrays/
http://www.cplusplus.com/doc/tutorial/templates/
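Terminology
Use of the word "array" immediately leads to confusion, for a couple of reasons.

1) Other languages have smart array types built in that work differently from C/C++ arrays. The term "array" is defined in many dictionaries, so there is a general concept of an array, and that leads to confusion when discussing the specific kinds of arrays defined by C++ or other languages.

2) The C++ standard describes std::vector as a sequence container, yet C++ programmers commonly call std::vector a dynamic array. In fact, any standard sequence container that provides random access fits the more general definition of the word "array".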

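For example, consider these definitions:
dictionary.com
Computers. a block of related data elements, each of which is usually identified by one or more subscripts.

Merriam Webster
(1): a number of mathematical elements arranged in rows and columns (2): a data structure in which similar elements of data are arranged in a table b: a series of statistical data arranged in classes in order of magnitude

When I use the word "array", I mean the more general dictionary definition. When referring to the "data structure" described by section 8.3.4 of the C++ standard, I will use the term "C array". The following example shows a C array. This data structure exists in the C language and must be supported by C++ compilers. I will use plenty of examples to explain why it is sometimes better to consider one of the standard sequence containers.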

// std::generate requires <algorithm>; rand requires <cstdlib>
const int SIZE(5);
int data[SIZE];
std::generate(data, data + SIZE, rand);
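For comparison, here is a minimal sketch of the same fill applied to a std::vector. The function name fillWithRandom is an assumption made for illustration and is not part of the original article.

```cpp
#include <algorithm>  // std::generate
#include <cstdlib>    // std::rand
#include <vector>

// Hypothetical helper: overwrites every element already in `data` with a
// pseudo-random value. The vector knows its own size, so no separate SIZE
// constant has to be passed along.
void fillWithRandom(std::vector<int>& data)
{
    std::generate(data.begin(), data.end(), std::rand);
}
```

A caller would create the vector with the desired number of elements, for example std::vector<int> data(5), and then call fillWithRandom(data).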

Part I - Passing arrays to functions
Compile and run the program. It contains a defect that will reveal itself when you analyze the output.

#include <iostream>

void printArray(int data[])
{
    for(int i = 0, length = sizeof(data); i < length; ++i)
    {
        std::cout << data[i] << ' ';
    }
    std::cout << std::endl;
}

int main()
{
    int data[] = { 5, 7, 8, 9, 1, 2 };
    printArray(data);
    return 0;
}

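You will see that only the first 4 elements of the C array are printed. sizeof(data) evaluates to 4! That happens to be the size of the pointer used to pass the array to printArray (on a 32-bit build; a 64-bit build typically yields 8, which makes the loop read past the end of the six-element array). This has several implications. First, the array is not copied; only a pointer to its first element is copied. C arrays have no copy constructor, assignment operator, or functional interface. In the examples that follow, you will see the dynamic sequence containers std::vector, std::deque, and std::list provided by the C++ Standard Template Library. This is not a complete container tutorial; the containers are used to show how flexibly the defective program can be improved.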
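The difference between what the caller sees and what the called function sees can be sketched as follows. The function names are assumptions made for illustration; the parameter is written as int* because that is what int data[] means in a parameter list.

```cpp
#include <cstddef>

// Inside the function only a pointer is available, so sizeof reports the
// size of that pointer, not the size of the caller's array.
std::size_t bytesSeenByCallee(int* data)
{
    return sizeof(data);
}

// In the scope where the array is declared, its full type int[6] is known,
// so the usual sizeof division yields the element count.
std::size_t elementCountInCaller()
{
    int data[] = { 5, 7, 8, 9, 1, 2 };
    return sizeof(data) / sizeof(data[0]);
}

// Passing the same array to a function converts it to a pointer.
std::size_t bytesSeenAfterPassing()
{
    int data[] = { 5, 7, 8, 9, 1, 2 };
    return bytesSeenByCallee(data);
}
```

elementCountInCaller() returns 6, while bytesSeenAfterPassing() returns sizeof(int*), whatever that is on the platform in use.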

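Let's look at another example. In this one, I created several overloaded printArray functions in order to show multiple solutions. I will then analyze each solution and explain its pros and cons.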

#include <iostream>
#include <vector>
#include <deque>
#include <list>

// Method 1: works but very little security.  It is impossible to validate
// the inputs since the size of data still cannot be validated. If length is too large
// undefined behavior will occur.
void printArray(int data[], int length)
{
    for(int i(0); i < length; ++i)
    {
        std::cout << data[i] << ' ';
    }
    std::cout << std::endl;
}

// Method 2: Type safe and more generic.  Works with any container that supports forward iterators.
// Limitation - cannot validate iterators so caller could pass null or invalid pointers.  Typesafe - won't
// allow you to pass inconsistent iterator types.  Allows you to pass any valid range of a container.
template <class ForwardIteratorType> 
void printArray(ForwardIteratorType begin, ForwardIteratorType end)
{
    while(begin != end)
    {
        std::cout << *begin << ' ';
        ++begin;
    }
    std::cout << std::endl;
}

// Method 3 - This implementation is as typesafe and secure as you can get but
// does not allow a subrange since the entire container is expected.  It could
// be useful if you want that extra security and know that you want to operate
// on the entire container.
template <class ContainerType> 
void printArray(const ContainerType& container)
{
    // typename is required because const_iterator is a dependent type name.
    typename ContainerType::const_iterator current(container.begin()), end(container.end());
    for( ; 
        current != end; 
        ++current)
    {
        std::cout << *current << ' ';
    }
    std::cout << std::endl;
}

int main()
{
    // Method 1.
    const int LENGTH(6);
    int data[LENGTH] = { 5, 7, 8, 9, 1, 2 };
    printArray(data, LENGTH);

    // Method 2.
    printArray(data, data + LENGTH);
    std::vector<int> vData(data, data + LENGTH);
    printArray(vData.begin(), vData.end());
    std::list<int> lData(data, data + LENGTH);
    printArray(lData.begin(), lData.end());
    std::deque<int> dData(data, data + LENGTH);
    printArray(dData.begin(), dData.end());
    // won't compile if caller accidentally mixes iterator types.
    //printArray(dData.begin(), vData.end());

    // method 3.
    printArray(vData);
    printArray(dData);
    printArray(lData);
    return 0;
}

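Method 2 is unique in that it lets you specify any range of the array, whereas methods 1 and 3 accomplish the same goal of printing the entire container. If that is your intent, then I think method 3 is the best. It is the most secure and the most type safe. There is very little chance of the caller specifying invalid arguments. An empty container causes no problems; the function simply prints no values.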
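As a rough sketch of the subrange flexibility of method 2, the code below formats only part of a container. To keep the result easy to check, it returns the text as a string instead of printing it; formatRange and formatMiddle are hypothetical names introduced here, not functions from the article.

```cpp
#include <iterator>   // std::advance
#include <list>
#include <sstream>
#include <string>

// Same loop as method 2, but the output is collected in a string.
template <class ForwardIteratorType>
std::string formatRange(ForwardIteratorType begin, ForwardIteratorType end)
{
    std::ostringstream out;
    while (begin != end)
    {
        out << *begin << ' ';
        ++begin;
    }
    return out.str();
}

// Hypothetical helper: formats everything except the first and last element.
// Assumes the list holds at least two elements.
std::string formatMiddle(const std::list<int>& values)
{
    std::list<int>::const_iterator first = values.begin();
    std::list<int>::const_iterator last = values.end();
    std::advance(first, 1);    // skip the first element
    std::advance(last, -1);    // list iterators are bidirectional
    return formatRange(first, last);
}
```

The same formatRange template also accepts plain pointers, so a call such as formatRange(data + 1, data + 3) on a C array covers just two elements.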

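Note that a C array cannot be passed via method 3. Method 3 requires a container such as std::vector. C arrays are a holdover from the C language and have no functional interface. If you are dealing with a C array, you need to use method 1 or 2. I'm sure there are other ways, but it is up to you to determine which approach is best for your project.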
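One such other way, sketched here under the assumption that the array is passed from the scope where it was declared: bind the C array by reference so that its length stays part of the type. formatArray is a hypothetical name and not one of the article's three methods; it returns text rather than printing so the result is easy to inspect.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// The parameter is a reference to an array of exactly N elements, so the
// compiler deduces N and the caller cannot supply a wrong length.
template <class ElementType, std::size_t N>
std::string formatArray(const ElementType (&data)[N])
{
    std::ostringstream out;
    for (std::size_t i = 0; i < N; ++i)
    {
        out << data[i] << ' ';
    }
    return out.str();
}
```

The trade-off is that a plain pointer, such as one received through method 1, no longer carries the length and will not compile as an argument here.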

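One could produce hundreds of example programs to further prove these points, but I will leave it to the reader to copy the programs and build other kinds of examples. The beauty of templates is that they reduce repetitive programming tasks. Define the function once, and it can be called many times, each time with a different type. It is simply a matter of ensuring that the type supports the function's minimum requirements. The method 3 printArray function requires that ContainerType have a nested const_iterator type and begin() and end() member functions that return forward iterators, and that the objects in the container are instances of a type that supports operator<<. operator<< can also be defined for user-defined types, so method 3 is not limited to containers of built-in types.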
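To illustrate the last point, here is a minimal sketch of method 3 applied to a user-defined type. The Score struct and its output format are assumptions invented for this example, and formatContainer returns a string instead of printing so the result can be checked.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical user-defined type, introduced only for illustration.
struct Score
{
    std::string player;
    int points;
};

// Supplying operator<< is all the type needs to satisfy the template below.
std::ostream& operator<<(std::ostream& out, const Score& score)
{
    return out << score.player << ':' << score.points;
}

// Same traversal as method 3, collecting the output in a string.
template <class ContainerType>
std::string formatContainer(const ContainerType& container)
{
    std::ostringstream out;
    typename ContainerType::const_iterator current(container.begin()), end(container.end());
    for ( ; current != end; ++current)
    {
        out << *current << ' ';
    }
    return out.str();
}
```

A std::vector<Score>, std::list<Score>, or std::deque<Score> can all be passed to formatContainer without any change to the template.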
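Part II - Returning arrays from functions
Below is an example containing two typical problems with returning arrays from functions. For the record, I don't think returning an array from a function is necessary. Returning a function's result seems natural, but it isn't required. You can supply data to the function through out parameters, using pointers or references.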

The following program produced this output using MS Visual Studio C++ Express 2008.

13 8 9 10 11 12
-858993460 -858993460 -858993460 -858993460 -858993460 3537572
41 18467 6334 26500 19169 15724
41 18467 6334 26500 19169 15724

#include <algorithm>
#include <cstdlib>   // rand
#include <iostream>

// Prints out array elements. Method 2 from PART I.
template <class ForwardIteratorType> 
void printArray(ForwardIteratorType begin, ForwardIteratorType end)
{
    while(begin != end)
    {
        std::cout << *begin << ' ';
        ++begin;
    }
    std::cout << std::endl;
}

// This function is a poor design which will lead to undefined behavior when the caller
// tries to use the pointer that is returned.  data is allocated on the stack and destroyed
// after the function returns.  The pointer to the memory is returned but it is a dangling
// pointer to memory that has already been released.
int* generateArray()
{
    int data[6] = { 13, 8, 9, 10, 11, 12 };
    int* pointer = data;
    printArray(pointer, pointer + 6);
    return pointer;
}

// The *& symbol means reference to a pointer so that modification of the array 
// results in modification of lotteryNumbers back in main.  In this case the pointer
// updated back in main is valid but the caller has to remember to release the memory
// at some point.  Therefore this approach is error prone.
void generateArray(int *&array, int length)
{
    int* pointer = new int[length];
    // std::generate requires the <algorithm> header
    std::generate(pointer, pointer + length, rand);
    printArray(pointer, pointer + length);
    array = pointer;
}

int main()
{
    int* lotteryNumbers = generateArray();
    printArray(lotteryNumbers, lotteryNumbers + 6);

    const int LENGTH(6);
    generateArray(lotteryNumbers, LENGTH);
    printArray(lotteryNumbers, lotteryNumbers + 6);
    delete[] lotteryNumbers;   // memory from new[] must be released with delete[]
    return 0;
}

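The first call to printArray happens inside the version of generateArray that returns a value. At that point the array named data is valid; because it was created inside the function, it was allocated from stack memory. Once generateArray returns, that memory goes back to the stack for the program to use for other purposes. The pointer returned to main therefore points to memory that can and will be overwritten, and the second line of output is garbage. The behavior is undefined, so the behavior of such a program cannot be predicted. The output I saw may not be what you see with another compiler and/or runtime environment.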

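The same version of generateArray has another problem. The function can return only one value. Even if the array had been constructed correctly using heap memory, how would main know the size of the array? In this case, the programmer who wrote both functions coded in an assumption about the length, which is a poor design.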

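Notice that there is another version of generateArray that takes two arguments and has a void return type. The first parameter is a reference to a pointer, so that main's lotteryNumbers pointer is modified. The second parameter is the length, which I require the caller to supply. Although the function accomplishes the task, is it the best approach? In large, complex applications, memory leaks can cause serious problems, and managing memory yourself can be difficult.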

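I think we can do better. One question that comes up is why you would want a function that builds an array at all. You can easily instantiate an array in place. So let me create a function that reads console input and fills an array for the user. The example below lets a function build the array without the caller having to worry about memory leaks or stack versus heap allocation. There are many ways to do this. In this case, I chose to let the caller pass an array of any size, and the function simply adds elements to it. It can be empty, but it doesn't have to be. std::vector manages the memory, so when the main function exits the vector is destroyed without the programmer having to worry about garbage collection.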

#include <vector>
#include <iostream>
#include <limits>

// Prints out array elements. Method 2 from PART I.
template <class ForwardIteratorType> 
void printArray(ForwardIteratorType begin, ForwardIteratorType end)
{
    while(begin != end)
    {
        std::cout << *begin << ' ';
        ++begin;
    }
    std::cout << std::endl;
}

// The caller must decide whether to pass an empty container.  This function will 
// add to it.  
void readScores(std::vector<int>& container)
{
    std::cout << "Type the list of scores followed by a non-numeric character and press enter when finished. " 
              << "For instance (22 25 26 f <enter> " << std::endl;
    int temp(0);
    while(std::cin >> temp)
    {
        container.push_back(temp);
    }
    // clear and discard any leftover data from the input stream.
    std::cin.clear();
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}

int main()
{
    std::vector<int> scores; // default-constructed, so it starts empty.  Let readScores fill it.
    readScores(scores);
    printArray(scores.begin(), scores.end());
    return 0;
}
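The same out-parameter pattern could also stand in for the new[]-based generateArray from the earlier program. The following is only a sketch under that assumption; generateNumbers is a hypothetical name, not a function from the article.

```cpp
#include <algorithm>  // std::generate_n
#include <cstddef>
#include <cstdlib>    // std::rand
#include <iterator>   // std::back_inserter
#include <vector>

// Appends `length` pseudo-random values to `numbers`. The vector owns the
// memory, so the caller never calls delete[], and the element count is
// available afterwards through numbers.size().
void generateNumbers(std::vector<int>& numbers, std::size_t length)
{
    std::generate_n(std::back_inserter(numbers), length, std::rand);
}
```

Because the function appends, calling it twice on the same vector grows the vector rather than replacing its contents, mirroring the behavior of readScores.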

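This time I chose not to make readScores a template function. It doesn't have to be one, and I wanted to keep the example fairly simple. It could be modified to be more generic. Try it if you dare, and watch what happens when you run it. The point is that a function doesn't really need to build the array. Building an array inside a function and returning it is tricky. You either have to deal with garbage collection or return a std container by value, which can result in unnecessary copy construction.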
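One possible shape for a more generic version is sketched below. It assumes the container provides push_back and a nested value_type, and it reads from any std::istream rather than only std::cin; readValues and parseInto are hypothetical names chosen for this illustration.

```cpp
#include <istream>
#include <limits>
#include <list>
#include <sstream>
#include <string>

// Reads values until extraction fails, appending each one to `container`.
template <class ContainerType>
void readValues(std::istream& in, ContainerType& container)
{
    typename ContainerType::value_type temp = typename ContainerType::value_type();
    while (in >> temp)
    {
        container.push_back(temp);
    }
    // clear the error state and discard the rest of the current line.
    in.clear();
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}

// Usage sketch: parse scores held in a string into a std::list<double>.
void parseInto(const std::string& text, std::list<double>& scores)
{
    std::istringstream in(text);
    readValues(in, scores);
}
```

Calling readValues(std::cin, scores) with a std::vector<int> would behave like the original readScores, minus the prompt.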

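Unfortunately, returning by value means that, at a minimum, you will likely need an assignment, which with a pre-C++11 compiler such as the one used here causes the caller's vector to allocate memory to hold the copied data (C++11 move semantics reduce this cost). The best approach really is to pass by reference and use a void return type, as I did in the previous example. That example is also more flexible, because the caller can decide whether to add elements to an existing array or fill a new one.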

std::vector<int> readScores()
{
    std::vector<int> container;
    std::cout << "Type the list of scores followed by a non-numeric character and press enter when finished. " 
              << "For instance (22 25 26 f <enter> " << std::endl;
    int temp(0);
    while(std::cin >> temp)
    {
        container.push_back(temp);
    }
    // clear and discard any leftover data from the input stream.
    std::cin.clear();
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    
    // return by value. Container will be destroyed but data will be copied into callers vector instance which could result
    // in additional memory allocation.  
    return container;
}
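How much copying the by-value version costs depends on how the caller uses the result. The sketch below assumes a stream-based variant so it can run without a console; readScoresFrom and countScores are hypothetical names introduced here. When the result initializes a new vector, compilers are permitted (though not required) to elide the copy, whereas assigning to an existing vector generally cannot avoid it before C++11.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stream-based variant of the by-value readScores above.
std::vector<int> readScoresFrom(std::istream& in)
{
    std::vector<int> container;
    int temp(0);
    while (in >> temp)
    {
        container.push_back(temp);
    }
    in.clear();
    return container;   // the copy here may be elided by the compiler
}

// Initialization rather than assignment: `scores` is constructed directly
// from the returned value.
std::size_t countScores(const std::string& text)
{
    std::istringstream in(text);
    std::vector<int> scores = readScoresFrom(in);
    return scores.size();
}
```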

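I will close by saying that there are other ways to accomplish these kinds of programming tasks, and I would like to encourage anyone to post some examples using template functions or the boost library.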