(2022.02.17) Extend File Descriptor 구현 (2/3)

Notice

Recent Posts

Recent Comments

Link

« 2026/05 »
일	월	화	수	목	금	토
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

Tags more

Archives

Today

Total

관리 메뉴

거북이의 쉼터

(2022.02.17) Extend File Descriptor 구현 (2/3) 본문

코딩 삽질/KAIST PINTOS (CS330)

(2022.02.17) Extend File Descriptor 구현 (2/3)

onlim 2022. 2. 17. 15:56

1. 서론 및 필요 내용 설명

이번 포스팅에서는 가이드라인에서 서술한 대로 dup2를 구현하기 위해 필요한 구조체를 설계하고 해당 구조체를 활용한 dup2를 구현하겠다. 또한, 기존 함수에서 해당 구조체와 상호작용이 필요하다면 같이 수정하도록 한다.

2. TODO

새로운 구조체 설계
dup2 함수의 구현
기존 함수의 수정

3. 구현 과정

지난번 그림은 정말 날것이었기에 보완이 필요하다. 일단 리스트 하나만 들있는 구조체는 만들 필요가 없다. 그냥 리스트의 포인터를 fildes 구조체에 넣어두고 사용하기로 한다. 공유되고 있는 fd의 리스트여야 일괄적으로 관리할 수 있기 때문에 리스트 자체는 어느 한 구조체에 귀속된 형태가 아닌 포인터로 가리키고 있는 형태여야 한다. 따라서 다음과 같이 dup2_union 리스트 포인터를 추가한다.

/* An open file that is kept in thread */
struct fildes {
	int fd; // file descriptor
	struct list *dup2_union;
	struct file *fp;
	struct list_elem fd_elem;
};

이와 반대로 리스트 자체의 엔트리로서의 구조체는 필요하다. 이놈의 pintos는 int 하나라도 리스트로서 활용을 하려면 list_elem 하고 같이 써야 하기 때문이다. 따라서 다음과 같이 dup2_union이 가리키는 리스트의 엔트리로 들어갈 구조체를 설계한다.

struct du_entry {
	int fd;
	struct list_elem du_elem;
};

예전에 쓰고 버렸던 free_fd_list와 같은 구조다.

이제 dup2 부터 구현해보자. 이전 포스팅에서 설명한 것대로 논리 흐름을 따라 구현하면 된다. 다만 이를 위해 새로운 비교 함수를 구현해 사용한다.

bool
cmp_fd_du_entry (const struct list_elem *a, const struct list_elem *b, void *aux) {
	struct du_entry *d_a = list_entry(a, struct du_entry, du_elem);
	struct du_entry *d_b = list_entry(b, struct du_entry, du_elem);

	return d_a->fd < d_b->fd;
}

아래는 dup2의 미완성 코드이다. 예외처리 코드부분 때문에 생각보다 길어졌다.

int _dup2 (int oldfd, int newfd)
{
	int res = -1;
	bool succ = false;

	struct list *union_ptr;
	struct fildes *oldfd_entry = NULL, *newfd_entry = NULL, *tmp;
	struct du_entry *oldfd_due, *newfd_due;

	filesys_enter();
	// check if oldfd is valid fd
	oldfd_entry = fd_search(oldfd);
	if (oldfd_entry == NULL)
		goto dup2_done;

	// oldfd is valid, check if oldfd == newfd (do nothing and return newfd)
	if (oldfd == newfd)
	{
		succ = true;
		res = newfd;
		goto dup2_done;
	}

	// check if newfd is non-negative.
	if (newfd < 0)
		goto dup2_done;

	// if newfd is already open, close it
	// and remove from fd_list
	tmp = fd_search(newfd);
	if (tmp != NULL)
	{
		// todo
	}

	newfd_entry = (struct fildes *) calloc(1, sizeof(struct fildes));
	if (newfd_entry == NULL)
		goto dup2_done;

	// check whether oldfd is part of dup2_union
	if (oldfd_entry->dup2_union == NULL)
	{
		union_ptr = malloc(sizeof(struct list));
		if (union_ptr == NULL)
			goto dup2_done;
		list_init(union_ptr);
		
		oldfd_due = (struct du_entry *) malloc(sizeof(struct du_entry));
		if (oldfd_due == NULL)
		{
			free(union_ptr);
			goto dup2_done;
		}
		oldfd_due->fd = oldfd;

		newfd_due = (struct du_entry *) malloc(sizeof(struct du_entry));
		if (newfd_due == NULL)
		{
			free(union_ptr);
			free(oldfd_due);
			goto dup2_done;
		}
		newfd_due->fd = newfd;

		list_insert_ordered(union_ptr, &oldfd_due->du_elem, cmp_fd_du_entry, NULL);
		list_insert_ordered(union_ptr, &newfd_due->du_elem, cmp_fd_du_entry, NULL);

		oldfd_entry->dup2_union = union_ptr;
	}

	else
	{
		union_ptr = oldfd_entry->dup2_union;
		newfd_due = (struct du_entry *) malloc(sizeof(struct du_entry));
		if (newfd_due == NULL)
			goto dup2_done;
		newfd_due->fd = newfd;
		list_insert_ordered(union_ptr, &oldfd_due->du_elem, cmp_fd_du_entry, NULL);
	}

	newfd_entry->fd = newfd;
	newfd_entry->dup2_union = union_ptr;
	newfd_entry->fp = oldfd_entry->fp; // get the exact same pointer
	list_insert_ordered(&thread_current()->fd_list, &newfd_entry->fd_elem, cmp_fd, NULL);
	
	succ = true;
	res = newfd;

dup2_done:
	filesys_exit();

	if (!succ)
	{
		res = -1;
		if (newfd_entry != NULL)
			free(newfd_entry);
	}

	return res;
}

이제 채워넣어야 할 부분은 newfd가 존재할 때 해당 파일을 닫는 부분이다. 해당 부분은 앞으로 수정해야 할 close와 유사하게 채워넣어질 것 같아 아래에서 일괄적으로 다루기로 하자.

주요하게 남은 것은 close와 exit, fork에서의 과정이다. 다 옆으로 치워두고 우리의 주적인 fork 부터 생각해보자. 지금 현재 fork의 구현대로라면 기존 프로세스에서 같은 file 포인터를 가지고 있는 fildes 구조체라도 개별적으로 file_duplicate가 적용되기 때문에 자식 프로세스에서는 같은 file 포인터를 가지고 있지 않게 된다. 때문에 복사 과정을 뜯어 고쳐야 한다. 이를 위해 여러 방면으로 고민을 해 보았다. 여기서 못 잡으면 hash와 신나는 시간을 보내야 한다.

고민한 결과 dup2_union에 한 가지 기능이 더 있으면 한다. 바로 접근했는지의 기록 여부이다. 부모 프로세스의 fd_list를 선형 순회하면서, dup2_union가 NULL인 경우는 단순 복사하면 되지만, dup2_union이 NULL이 아닐 경우 해당 리스트에 들어있는 fd의 수만큼 새로운 fd_list에 할당해주어야 한다. 문제는 계속 순회를 하면서 다시 같은 dup2_union을 만날 경우 같은 fd가 중복하여 들어간다는 것이다. 따라서, dup2_union에 fork의 복사 루틴이 접근했다는 흔적을 남길 수 있다면 해당 fd는 건너뛰고 다음 fd로 넘어갈 수 있다. 넣어줄 때는 list_push_back으로 일괄적으로 넣고 마지막에 정렬을 한다면 O(nlogn) 시간만에 복사를 끝낼 수 있다. 이래도 multi-oom 강화버젼을 통과할지는 의문이지만 이게 list로 하로 수 있는 최선이다.

결국 이를 위해서는 dup2_union이 단순 리스트를 가리키는 포인터가 아닌 리스트를 포함한 구조체를 가리키는 포인터여야 한다. ~~어차피 이래나 저래나 만들게 되는구나... ㅋㅋㅋ~~ 그래서 만들어주었다. 기왕 만드는 김에 다른 구조체의 이름도 더 직관적으로 수정하였다.

struct dup2_fd {
	int fd;
	struct list_elem du_elem;
};

struct dup2_union {
	bool accessed;
	struct list dup2_fd_list; // list of struct dup2_fd
};

/* An open file that is kept in thread */
struct fildes {
	int fd; // file descriptor
	struct dup2_union *dup; // short for dup2 union pointer
	struct file *fp;
	struct list_elem fd_elem;
};

_dup2도 해당 구조체에 맞춰 수정한다. 코드는 길이상 가장 끝에 최종 버젼으로 한 번 넣도록 하겠다. 이제 fork를 수정한다. 전체 복사 루틴이 마무리된 이후에는 다시 각 접근 플래그를 초기화하여 여러번 fork될 가능성을 준다.

static void
__do_fork (void *aux) {
	...
	struct list_elem *e, *el;
	struct fildes *fd_entry, *fd_entry_copy;
	struct file *fp = NULL;

	struct dup2_union *union_ptr;
	struct dup2_fd *dup2_fd_entry, *dup2_curfd;
	int cnt = 0;

	filesys_enter();

	/* copy files from parent fd_list */
	for (e = list_begin(&parent->fd_list); e != list_end(&parent->fd_list); e = list_next(e))
	{
		fd_entry = list_entry (e, struct fildes, fd_elem);

		// first check it's dup2 relevant
		if (fd_entry->dup == NULL) // normal file copy
		{
			fd_entry_copy = (struct fildes *)malloc(sizeof(struct fildes));
			if (fd_entry_copy == NULL)
			{
				succ = false;
				goto file_copy_done;
			}

			if (fd_entry->fp != 0 && fd_entry->fp != 1)
			{
				fp = file_duplicate (fd_entry->fp);
				if (fp == NULL)
				{
					free (fd_entry_copy);
					succ = false;
					goto file_copy_done;
				}
			}

			else
				fp = fd_entry->fp;
			
			fd_entry_copy->fd = fd_entry->fd;
			fd_entry_copy->dup = NULL;
			fd_entry_copy->fp = fp;
			list_push_back(&current->fd_list, &fd_entry_copy->fd_elem);
		}

		else
		{
			if (fd_entry->dup->accessed) // already copied. 
				continue;

			fd_entry->dup->accessed = true;
			if (fd_entry->fp != 0 && fd_entry->fp != 1)
			{
				fp = file_duplicate (fd_entry->fp);
				if (fp == NULL)
				{
					succ = false;
					goto file_copy_done;
				}
			}

			else 
				fp = fd_entry->fp; // 0 or 1

			// now copied file is ready  
			
			union_ptr = (struct dup2_union *)malloc(sizeof(struct dup2_union));
			if (union_ptr == NULL)
			{
				if (fp != 0 && fp != 1)
					file_close(fp); // should close the file at this point, else there is no way to access
				succ = false;
				goto file_copy_done;
			}
			list_init(&union_ptr->dup2_fd_list);
			union_ptr->accessed = false;

			cnt = 0;
			
			// we need enough allocation for 1 dup2_union, n dup2_fd, n fildes 
			// for fail-safe and tracking in exit, we allocate one by one
			// if at least 1 entry is in, then we don't close the file -> will be closed in exit 
			for (el = list_begin(&fd_entry->dup->dup2_fd_list); 
				 el != list_end(&fd_entry->dup->dup2_fd_list); el = list_next(el))
			{
				fd_entry_copy = (struct fildes *)malloc(sizeof(struct fildes));
				if (fd_entry_copy == NULL)
				{
					if (cnt == 0)
					{
						if (fp != 0 && fp != 1)
							file_close(fp);
						free(union_ptr);
					}
					succ = false;
					goto file_copy_done;
				}

				dup2_fd_entry = (struct dup2_fd *)malloc(sizeof(struct dup2_fd));
				if (dup2_fd_entry == NULL)
				{
					if (cnt == 0)
					{
						if (fp != 0 && fp != 1)
							file_close(fp);
						free(union_ptr);
					}
					free(fd_entry_copy);
					succ = false;
					goto file_copy_done;
				}

				dup2_curfd = list_entry(el, struct dup2_fd, du_elem);
				
				dup2_fd_entry->fd = dup2_curfd->fd;
				list_push_back(&union_ptr->dup2_fd_list, &dup2_fd_entry->du_elem);

				fd_entry_copy->fd = dup2_curfd->fd;
				fd_entry_copy->dup = union_ptr;
				fd_entry_copy->fp = fp;
				list_push_back(&current->fd_list, &fd_entry_copy->fd_elem);

				cnt++;
			}
		}
	}

	fp = file_duplicate(parent->exec_file);
	if (fp == NULL)
	{
		succ = false;
		goto file_copy_done;
	}
	current->exec_file = fp;

file_copy_done:
	// loop again to recover the access flag to false in parent process
	for (e = list_begin(&parent->fd_list); e != list_end(&parent->fd_list); e = list_next(e))
	{
		fd_entry = list_entry (e, struct fildes, fd_elem);
		if (fd_entry->dup != NULL)
			fd_entry->dup->accessed = false;
	}

	// sort the fd_list in child process
	list_sort (&current->fd_list, cmp_fd, NULL);
	
	filesys_exit();
	...
}

위와 같이 fork를 수정하면 fork에 대한 작업은 끝난다.

이제 exit을 보자. exit도 별개의 문제가 있다. 현재 exit(process_exit)은 fd_list를 순회하면서 해당 시점까지도 열려있는 파일을 순차적으로 닫는다. 그 과정에서 file_close를 각 엔트리의 fp에 하게 되는데 이는 dup2가 적용되면 문제가 된다. 같은 fp를 가지게 되는 현재 구현 구조상 file_close가 같은 fp에 대해서 여러번 실행될 수 있으며, 중복하여 free를 하는 것은 분명 문제를 초래하게 된다.

따라서 마지막 순간에 파일을 닫을 때 동일한 fp에 대하여 중복되지 않게 file_close를 호출해 자원을 해제하도록 코드를 수정해야 한다. 이는 fork를 위해 추가했던 accessed 플래그를 활용해 file_close를 했음을 표시해줄 수 있다. 이미 닫은 fp에 대해서는 그냥 무시하고 넘어가면 되는 것이다.

fp는 중복해제 문제를 피했다. 이제 남은 자원 해제 시점을 생각해보자. fildes 구조체에서 dup 포인터가 NULL일 경우 fp는 단순 해제하면 된다. dup 포인터가 NULL이 아닐 경우, dup 포인터가 가리키는 dup2_union 구조체를 위해 할당한 공간과, 내부의 dup2_fd_list에 들어갈 인자를 위해 할당한 공간을 해제해야 한다. 이를 위한 해제 순서는:

현재 순회하고 있는 fildes의 fd에 해당하는 dup2_fd 구조체를 dup->dup2_fd_list에서 꺼내 해제한다.
만약 dup2_fd_list가 비게 된다면 해당 dup2_union을 사용할 fd는 더 이상 남아있지 않다는 뜻이다. 따라서 해당 시점에 dup을 해제한다.

이다. 지금까지 설계한 내용을 바탕으로 process_exit의 파일 정리 부분을 수정하면 아래와 같다.

void
process_exit (void) {
	...
	filesys_enter ();

	while (!list_empty (&curr->fd_list))
	{
		e = list_pop_front (&curr->fd_list);
		struct fildes *fd_entry = list_entry (e, struct fildes, fd_elem);

		if (fd_entry->dup != NULL)
		{
			if (!fd_entry->dup->accessed)
			{
				if (fd_entry->fp != 0 && fd_entry->fp != 1)
					file_close (fd_entry->fp);
				fd_entry->dup->accessed = true;
			}

			// get rid of the fd from fd_entry->dup->dup2_fd_list
			for (el = list_begin(&fd_entry->dup->dup2_fd_list);
				 el != list_end(&fd_entry->dup->dup2_fd_list); el = list_next(el))
			{
				dup2_fd_entry = list_entry(el, struct dup2_fd, du_elem);
				if (dup2_fd_entry->fd == fd_entry->fd)
					break;
			}
			
			list_remove (el);
			free (dup2_fd_entry);

			// check whether the list is empty
			if (list_empty(&fd_entry->dup->dup2_fd_list))
				free(fd_entry->dup);
		}

		else
		{	
			if (fd_entry->fp != 0 && fd_entry->fp != 1)
				file_close (fd_entry->fp);
		}

		free (fd_entry);
	}

	filesys_exit ();
	...
}

슬슬 징글징글하다. 이제 close를 보자.

이전 포스팅까지 계속 염두하던 부분은 한 fd에 대해서 close를 할 때 연동된 나머지 fd(들)에 대해서는 영향이 없어야 한다는 것이었다. 이전 포스팅에서 제시한 방법과 유사한 방식으로 설계한다.

close를 할 때 닫으려고 하는 fildes의 dup->dup2_fd_list에서 해당 fd를 제외하고 관련 자원도 해제한다.
제외한 뒤, dup2_fd_list에 원소가 하나가 남았다면 남은 fd(left_fd)도 꺼내서 해제하고 dup 관련 자원을 모두 해제한다. left_fd를 가진 fildes의 dup을 NULL로 설정한다.
file_close를 호출하지 않고 fildes 구조체의 자원만 해제한 채로 종료한다.

생각해보니 굳이 복사본을 만들지 않고 최후의 경우, 그러니까 dup 포인터가 NULL인 경우에만 file_close를 호출하면 된다는 사실을 알게되었다. 이제 설계대로 구현한다.

void _close (int fd)
{
	filesys_enter();

	struct list_elem *e;
	struct dup2_fd *dup2_closefd, *dup2_leftfd;
	int left_fd;

	struct fildes *fd_entry = fd_search(fd), *left_fd_entry;
	struct file *fp;

	if (fd_entry == NULL) // not valid fd
		goto close_done;

	list_remove (&fd_entry->fd_elem);

	if (fd_entry->dup == NULL)
	{	
		fp = fd_entry->fp;
		if (fp != 0 && fp != 1)
			file_close (fp);
	}

	else
	{
		struct list *dpfd_list = &fd_entry->dup->dup2_fd_list;
		for (e = list_begin(dpfd_list); e != list_end(dpfd_list); e = list_next(e))
		{
			dup2_closefd = list_entry(e, struct dup2_fd, du_elem);
			if (dup2_closefd->fd == fd)
				break;
		}

		list_remove(e);
		free(dup2_closefd);

		if (list_size(dpfd_list) == 1)
		{
			e = list_pop_front(dpfd_list);
			dup2_leftfd = list_entry(e, struct dup2_fd, du_elem);
			left_fd = dup2_leftfd->fd;
			
			// search for left_fd fildes in fd_list
			left_fd_entry = fd_search(left_fd);
			left_fd_entry->dup = NULL;

			free(dup2_leftfd);
			free(fd_entry->dup); // same as previous left_fd_entry->dup
		}
	}

	free (fd_entry); // always free what you malloc-ed;

close_done:
	filesys_exit();
}

dup2에도 비슷한 코드를 넣는다. 최종적으로 완성된 dup2는 아래와 같다.

int _dup2 (int oldfd, int newfd)
{
	int res = -1;
	bool succ = false;

	struct dup2_union *union_ptr;
	struct fildes *oldfd_entry = NULL, *newfd_entry = NULL;
	struct fildes *tmp_entry, *left_fd_entry;
	struct dup2_fd *dup2_oldfd, *dup2_newfd;
	struct dup2_fd *dup2_tmpfd, *dup2_leftfd;

	struct list_elem *e;
	struct file *fp;
	int left_fd;
	struct thread *curr = thread_current();

	filesys_enter();
	// check if oldfd is valid fd
	oldfd_entry = fd_search(oldfd);
	if (oldfd_entry == NULL)
		goto dup2_done;

	// oldfd is valid, check if oldfd == newfd (do nothing and return newfd)
	if (oldfd == newfd)
	{
		succ = true;
		res = newfd;
		goto dup2_done;
	}

	// check if newfd is non-negative.
	if (newfd < 0)
		goto dup2_done;

	// if newfd is already open, close it
	// and remove from fd_list
	tmp_entry = fd_search(newfd);
	if (tmp_entry != NULL)
	{
		list_remove (&tmp_entry->fd_elem);

		if (tmp_entry->dup == NULL) // normal close process
		{
			fp = tmp_entry->fp;
			if (fp != 0 && fp != 1)
				file_close (fp);
		}

		else // remove the relation of dup2
		{
			struct list *dpfd_list = &tmp_entry->dup->dup2_fd_list;
			for (e = list_begin(dpfd_list); e != list_end(dpfd_list); e = list_next(e))
			{
				dup2_tmpfd = list_entry(e, struct dup2_fd, du_elem);
				if (dup2_tmpfd->fd == newfd)
					break;
			}

			list_remove(e);
			free(dup2_tmpfd);

			if (list_size(dpfd_list) == 1)
			{
				e = list_pop_front(dpfd_list);
				dup2_leftfd = list_entry(e, struct dup2_fd, du_elem);
				left_fd = dup2_leftfd->fd;
				
				left_fd_entry = fd_search(left_fd);
				left_fd_entry->dup = NULL;

				free(dup2_leftfd);
				free(tmp_entry->dup);
			}
		}

		free (tmp_entry);
	}

	newfd_entry = (struct fildes *) calloc(1, sizeof(struct fildes));
	if (newfd_entry == NULL)
		goto dup2_done;

	// check whether oldfd is part of dup2_union
	if (oldfd_entry->dup == NULL)
	{
		union_ptr = malloc(sizeof(struct dup2_union));
		if (union_ptr == NULL)
			goto dup2_done;
		list_init(&union_ptr->dup2_fd_list);
		union_ptr->accessed = false;
		
		dup2_oldfd = (struct dup2_fd *) malloc(sizeof(struct dup2_fd));
		if (dup2_oldfd  == NULL)
		{
			free(union_ptr);
			goto dup2_done;
		}
		dup2_oldfd->fd = oldfd;

		dup2_newfd = (struct dup2_fd *) malloc(sizeof(struct dup2_fd));
		if (dup2_newfd == NULL)
		{
			free(union_ptr);
			free(dup2_oldfd);
			goto dup2_done;
		}
		dup2_newfd->fd = newfd;

		list_insert_ordered(&union_ptr->dup2_fd_list, &dup2_oldfd->du_elem, cmp_dup2_fd, NULL);
		list_insert_ordered(&union_ptr->dup2_fd_list, &dup2_newfd->du_elem, cmp_dup2_fd, NULL);

		oldfd_entry->dup = union_ptr;
	}

	else
	{
		union_ptr = oldfd_entry->dup;
		dup2_newfd = (struct dup2_fd *) malloc(sizeof(struct dup2_fd));
		if (dup2_newfd == NULL)
			goto dup2_done;
		dup2_newfd->fd = newfd;
		list_insert_ordered(&union_ptr->dup2_fd_list, &dup2_newfd->du_elem, cmp_dup2_fd, NULL);
	}

	newfd_entry->fd = newfd;
	newfd_entry->dup = union_ptr;
	newfd_entry->fp = oldfd_entry->fp; // get the exact same pointer
	list_insert_ordered(&curr->fd_list, &newfd_entry->fd_elem, cmp_fd, NULL);
	
	succ = true;
	res = newfd;

dup2_done:
	filesys_exit();

	if (!succ)
	{
		res = -1;
		if (newfd_entry != NULL)
			free(newfd_entry);
	}

	return res;
}

어우 길어....

4. 디버깅

아오 exit fork 이 거지같은 함수들은 왜 계속 손을 봐야 할까. 이제 디버깅 시작....인데 오늘 힘들어서 내일이나 해야지.

5. 후기

꽤 간단할 줄 알았는데 구조체 2개 선언했다고 코드가 뻥튀기가 되는 기적을 보았다. 미친...

다음 포스팅 때는 끝내게 해주세요.

'코딩 삽질 > KAIST PINTOS (CS330)' 카테고리의 다른 글

(2022.02.20) 프로젝트 3 Introduction 맛보기 (1)	2022.02.20
(2022.02.18) Extend File Descriptor 구현 (3/3) (0)	2022.02.18
(2022.02.17) Extend File Descriptor 구현 (1/3) (0)	2022.02.17
(2022.02.16) Extend File Descriptor 가이드라인 (0)	2022.02.16
(2022.02.16) System Calls 구현 (4/4) (0)	2022.02.16

'코딩 삽질/KAIST PINTOS (CS330)' Related Articles

Comments

거북이의 쉼터

(2022.02.17) Extend File Descriptor 구현 (2/3) 본문

(2022.02.17) Extend File Descriptor 구현 (2/3)

1. 서론 및 필요 내용 설명

2. TODO

3. 구현 과정

4. 디버깅

5. 후기

'코딩 삽질 > KAIST PINTOS (CS330)' 카테고리의 다른 글

티스토리툴바