The purpose of this assignment is to help you learn how to
implement common data structures in C and how to exploit them to
achieve modularity in a real-world application. It also will give you
the opportunity to gain more experience with the GNU/Linux programming
tools, especially bash
,
emacs
, and gdb
.
Dynamic resizing of an array in task 1 is "on your own" part of this assignment. This part is worth 15% of this assignment.
A data structure is a way of organizing data for efficient operation. In this assignment, you will implement the required functionalities (register, unregister, and find) of a customer management program using the following data structures.
You will implement and improve the customer data management API using various data structures. Your task in this assignment is threefold:
customer_manager
Interfacecustomer_manager
is an API library for the customer data management, where the users can register the customer information and perform lookup operations to retrieve the purchase amount information.
The customer_manager
interface introduces a structure type definition, DB_T
:
DB_T
structure is used for saving
the pointer(s) to the entire customer data.The customer_manager
interface is described in a file named customer_manager.h
, and it contains these function declarations:
DB_T CreateCustomerDB(void);
void DestroyCustomerDB(DB_T d);
int RegisterCustomer(DB_T d, const char *id, const char *name, const int purchase);
int UnregisterCustomerByID(DB_T d, const char *id);
int UnregisterCustomerByName(DB_T d, const char *name);
int GetPurchaseByID(DB_T d, const char *id);
int GetPurchaseByName(DB_T d, const char *name);
typedef int (*FUNCPTR_T)(const char* id, const char* name, const int purchase);
int GetSumCustomerPurchase(DB_T d, FUNCPTR_T fp);
What each function does is as follows:
CreateCustomerDB
should allocate memory for a new DB_T
object and any underlying objects.DB_T
object.NULL
.DestroyCustomerDB
should free all memory occupied by the DB_T
object and all the underlying objects.DB_T
object is NULL
, it should do nothing.RegisterCustomer
registers a new item, that is, a new
user whose ID is id
, name is name
, and
purchase amount is purchase
to the DB_T
object d
.
d
. strdup()
. If you use strdup()
, please add -D_GNU_SOURCE
after gcc209
. (e.g., $ gcc209 -D_GNU_SOURCE -o testclient testclient.c
)d
, id
, or name
is NULL
, it is a failure. If purchase
is zero or a negative number, it is a failure.UnregisterCustomerByID
unregisters a user whose ID is id
from the DB_T
object, d
.d
or id
is NULL
, it is a failure. If no such item exists, it is a failure.UnregisterCustomerByName
unregisters a user whose name is name
from the DB_T
object, d
.UnregisterCustomerByID
except that it matches name
instead of id
.
d
or name
is NULL
, it is a failure. If no such item exists, it is a failure.GetPurchaseByID
searches for the purchase amount of the customer whose ID is id
from a DB_T
object d
.d
or id
is NULL
, it is a failure. If there is no customer whose ID matches the given one, it is a failure.GetPurchaseByName
searches for the purchase amount of the customer whose name is name
from a DB_T
object d
.GetPurchaseByID
except that it matches name
instead of id
.
d
or name
is NULL
, it is a failure. If there is no customer whose name matches the given one, it is a failure. GetSumCustomerPurchase
calculates the sum of the numbers returned by fp
by calling fp
for each user item. That is, this function iterates
every user item in d
, calls fp
once for each user item, and returns the sum of the numbers returned by fp
. GetCustomerPurchase
should return the sum of all numbers returned by fp
by iterating each user item in d
.d
or fp
is NULL, it should return -1.fp
is provided by the caller
of GetSumCustomerPurchase
. fp
is a function
pointer that takes user's id, name, and purchase as parameters,
evaluates a specific condition on it, and returns a non-negative
number. For example, the following code snippet shows the example of
the function pointed by fp
, which returns the half of the
purchase amount of the user whose name contains with "Gorilla".int GetHalfAmountOfUserWhoseNameContainsGorilla(const char *id, const char *name, const int purchase) {
if (strstr(name, "Gorilla") != NULL) return (purchase/2); return 0; } ... // call of GetSumCustomerPurchase() with the above function int value = GetSumCustomerPurchase(d, GetHalfAmountOfUserWhoseNameContainsGorilla); printf("Half the purchase amount of the users whose name contains Gorilla is %d\n", value);
customer_manager
Array ImplementationThe goal of the first task is to implement
the
Your first customer_manager
implementation should be as follows:
customer_manager.h
.customer_manager1.c
(skeleton code
).customer_manager
implementation should use a dynamically-allocated memory (i.e.,
in CreateCustomerDB()
and DestroyCustomerDB()
). It should make sure to free all
dynamically-allocated memory when the memory is no longer needed.assert
macro. Determining which invariant should be
maintained is a part of your task. CreateCustomerDB
is
called. You may start with the initial array size as 1024. CreteCustomerDB
is as follows. You can use the
following code or define your own structure differently.
#define UNIT_ARRAY_SIZE 1024 struct UserInfo { char *name; // user name char *id; // user ID int purchase; // purchase amount }; struct DB { struct UserInfo *pArray; // pointer to the array int curArrSize; // current array size (max # of elements) int numItems; // # of stored items, // needed to determine when the array should be expanded // add more records below if needed ... }; ... DB_T CreateCustomerDB(void) { DB_T d; d = (DB_T) calloc(1, sizeof(struct DB)); if (d == NULL) { fprintf(stderr, "Can't allocate a memory for DB_T\n"); return NULL; } d->curArrSize = UNIT_ARRAY_SIZE; // start with 1024 elements d->pArray = (struct UserInfo *)calloc(d->curArrSize, sizeof(struct UserInfo)); if (d->pArray == NULL) { fprintf(stderr, "Can't allocate a memory for array of size %d\n", d->curArrSize); free(d); return NULL; } return d; }
calloc()
to allocate the array is often helpful
since it initializes all elements (and their fields)
to 0
.
RegisterCustomer
, find an empty element in the
array, and store the new user data in it. Make sure that you copy id
and name strings instead of only their pointers. If there is no empty
element, expand the array by calling realloc()
. The
increment amount should be 1024 elements each time you expand.
UnregisterCustomerByID
or UnregisterCustomerByName
, you search for the matching
item, and deallocate the name and id. Optionally, you can set the name
to NULL
. This way, you can easily know which element is
empty by checking each element's name
with NULL
. GetSumCustomerPurchase
, scan the array from
index 0 till the max index, and call fp
for each _valid_
element. customer_manager
Hash Table ImplementationUnfortunately, using an array is slow when you deal with a large number of user items. Frequent reqistration and unregisteration of a user item creates many holes (empty elements) scattered across the array, which, in turn, makes these operations slow. Adding, deleting, and searching of a user item would eventually depend on linear search (unless you take extra measures to manage the holes separately).
We improve the performance of customer_manager
operations
with a hash table in this task. Actually, you would need two hash
tables. One is for looking up a user item with ID as a key, and the
other is for a lookup with a name as a key.
Your hash table-based customer_manager
implementation
should:
customer_manager.h
.customer_manager2.c
.malloc
or calloc
in any object, eventually there should be
exactly one call of free
.enum {HASH_MULTIPLIER = 65599}; ... static int hash_function(const char *pcKey, int iBucketCount) /* Return a hash code for pcKey that is between 0 and iBucketCount-1, inclusive. Adapted from the EE209 lecture notes. */ { int i; unsigned int uiHash = 0U; for (i = 0; pcKey[i] != '\0'; i++) uiHash = uiHash * (unsigned int)HASH_MULTIPLIER + (unsigned int)pcKey[i]; return (int)(uiHash % (unsigned int)iBucketCount); }Implementation tips:
readme
file.
customer_manager
implementation
(it uses the hash_function
mentioned above).
hash_function("ygmoon", n)
and hash_function
("ch.hwang128", n)
are 0.hash_function("baesangwook89", n)
is 4.hash_function("Changho Hwang", n)
and hash_function
("Sangwook Bae", n)
are 3.hash_function("YoungGyoun Moon", n)
is 5.
We
provide testclient.c
to test your implementations. It first checks the correctness of your
library functions and measures the performance over various user
items. Note that we may use other programs for grading.
To compile your code, do the following:
// test your array-based implementation $ gcc209 -o testclient1 testclient.c customer_manager1.c $ ./testclient1 // test your hash table-based implementation $ gcc209 -o testclient2 testclient.c customer_manager2.c $ ./testclient2
(Extra credit: 15% ) We will give an extra credit to the student whose implementation is the fastest among all students. We may use our own program to measure the performance.
Develop in your own environment using emacs
to create
source code and gdb
to debug. Make sure to compile
with gcc209
and test your code on lab machine before
submission.
Please follow the steps through Task 1 to Task 3 to complete
the customer_manager
API, and test your libraries.
Create a readme
text file that contains:
customer_manager
using array and the one using hash table
respectively. Please try to provide reasonable explanation for the pros and cons of each implementaion. Use KAIST KLMS to submit your assignments. Your submission should be one gzipped tar file whose name is
YourStudentID_assign3.tar.gz
Your submission need to include the following files:
customer_manager1.c
customer_manager2.c
readme
text fileWe will grade your work on quality from the user's point of view and from the programmer's point of view. To encourage good coding practices, we will deduct points if gcc209
generates warning messages.
From the user's point of view, your module has quality if it behaves as it should.
In part, style is defined by the rules given in The Practice of Programming (Kernighan and Pike), as summarized by the Rules of Programming Style document. These additional rules apply:
Names: You should use a clear and consistent style for variable and function names. One example of such a style is to prefix each variable name with characters that indicate its type. For example, the prefix c
might indicate that the variable is of type char
, i
might indicate int
, pc
might mean char*
, ui
might mean unsigned int
, etc. But it is fine to use another style -- a style which does not include the type of a variable in its name -- as long as the result is a readable program.
Line lengths: Limit line lengths in your source code to 72 characters. Doing so allows us to print your work in two columns, thus saving paper.
Comments: Each source code file should begin with a comment that includes your name, the number of the assignment, and the name of the file.
Comments: Each function should begin with a comment that describes what the function does from the caller's point of view. The function comment should:
Comments: Each structure type definition and each structure field definition should have a comment that describes it.
Comments: The interface of each data structure should contain a comment that describes what an object of that type is. It would be reasonable to place that comment adjacent to the definition of the opaque pointer.