Subject | Serialization? |
---|---|
Author | Matteo Giacomazzi |
Post date | 2002-07-05T15:16:17Z |
Hi all,
as I mentioned before, I'm writing my graduation thesis.
Part of my thesis requires me to build a kind of "crawler", that is, a
"robot" that discovers and retrieves documents from a network. This
robot stores the URLs it finds in those documents in a Firebird DB.
For netiquette reasons, while one thread is downloading a
document from a host, no other thread should download documents from
the same host.
HOST is a separate table, so I can "mark" a row as "downloading"
when it is assigned to a task and then mark it as "free" when the
download ends.
Now the problem: let's imagine that two threads request a URL at the
same time.
1. How can I prevent them from getting the same URL?
2. How can I prevent them from getting two different URLs on the same
HOST?
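To make the race concrete, here is a minimal sketch of what I'm afraid of. It is only an illustration under assumptions: the table and column names (`host`, `url`, `status`) are made up, SQLite stands in for Firebird so the snippet runs on its own, and the two "threads" are simulated by interleaving the calls by hand.

```python
import sqlite3

# Stand-in schema: table and column names are assumptions, and SQLite is
# used only so the sketch is self-contained; the real database is Firebird.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT, status TEXT);
    CREATE TABLE url  (id INTEGER PRIMARY KEY, host_id INTEGER, path TEXT);
    INSERT INTO host VALUES (1, 'example.org', 'free');
    INSERT INTO url  VALUES (1, 1, '/a'), (2, 1, '/b');
""")

def pick_free_url(conn):
    """Step 1 of the naive scheme: read a URL whose host is marked 'free'."""
    return conn.execute(
        "SELECT url.id, url.host_id FROM url "
        "JOIN host ON host.id = url.host_id "
        "WHERE host.status = 'free' ORDER BY url.id LIMIT 1"
    ).fetchone()

def mark_downloading(conn, host_id):
    """Step 2 of the naive scheme: mark the host as busy."""
    conn.execute(
        "UPDATE host SET status = 'downloading' WHERE id = ?", (host_id,)
    )

# Interleave the two steps the way two threads could:
# both read before either one writes.
choice_a = pick_free_url(conn)
choice_b = pick_free_url(conn)
mark_downloading(conn, choice_a[1])
mark_downloading(conn, choice_b[1])

print(choice_a, choice_b)
```

Both calls return the same URL on the same host, because the check and the mark are two separate steps. That is exactly the situation I would like to rule out.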
I guess the problem would be easily solved if there were some
kind of "serialization" of DB queries.
I know that a real DBMS should be able to execute queries as if they
were serialized but... how can I explain that to my Firebird?
There must be a trick I cannot see...
Thank you in advance and sorry for the long post!
Kind regards,
--
Matteo
mailto:matteo.giacomazzi@...
ICQ# 24075529