Last week, I embarked on an adventure into the world of web application programming. Since my work place uses Python as much as possible and my boss likes TurboGears, I chose it for this endeavor. I have worked through various TurboGears tutorials and thought it looked pretty cool. However, it doesn’t take long to discover that there is a lot of undocumented functionality. In this case, I needed a web application that could access multiple databases. I knew SqlAlchemy could do it and since TG2 uses SqlAlchemy, I figured it would too. In this article you will get to travel down the rabbit hole with me as I explain how I figured it out.
When you go looking for help, the first article Google is likely to return is Mark Ramm’s blog post about how easy it is to do. However, Ramm does not actually explain how to do it. Another fellow on the TurboGears Google Group posted a link to the Pylons way of setting up multiple databases. If you go there, you’ll find out that the first step is set up multiple SqlAlchemy URLs in your config file, which in the case of TurboGears is your development.ini file. All you need to do is add one or more SqlAlchemy URLs to the [app:main] section.
So, instead of this:
sqlalchemy.url = sqlite:///%(here)s/devdata.db
You would do something like this:
sqlalchemy.first.url=mssql://user:password@ntsql.servername.com/database
sqlalchemy.second.url=sqlite:///%(here)s/devdata.db
Notice that you add to the dot notation to make the URLs unique. I think the “sqlalchemy” and “url” parts are required at the beginning and the end, but the rest can be whatever you want. If that was all you needed to do, this would indeed be an easy setup. However, we have a couple more files to modify. The next part was pretty tricky. I discovered that there’s some comments on multiple databases in the model folder’s __init__.py file. It claims that you need to create a new MetaData object, which is true, but the commented out example is misleading. In the example, the metadata is not bound to an engine object. Without that info, the metadata object will be basically useless. I then noticed the init_model method which is there for reflection purposes. Upon further digging, I found that you needed to modify it to pass in the engines needed for your various databases, This is where I ended up putting my new metadata object. Technically, you could create it before the method and then just make it global, but I didn’t need that in my test case. Change yours as needed. Following are the changes I needed to make:
# Global session manager: DBSession() returns the Thread-local # session object appropriate for the current web request. maker = sessionmaker(autoflush=True, autocommit=False, extension=ZopeTransactionExtension()) DBSession = scoped_session(maker) maker2 = sessionmaker(autoflush=True, autocommit=False, extension=ZopeTransactionExtension()) secondSession = scoped_session(maker2) def init_model(engineOne, engineTwo): """Call me before using any of the tables or classes in the model.""" DBSession.configure(bind=engineOne) secondSession.configure(bind=engineTwo) # you only need this metadata # if you want to autoload a table second_metadata = MetaData(engineTwo)
The last file that should be edited is the “app_cfg.py” located in the config folder. Here you need to subclass the “AppConfig” object to override the “setup_sqlalchemy” method. This is required only if you have modified your init_model method (above) to accept multiple engines. If you do not do this, you’ll receive a lovely traceback about your method needing additional parameters. Here’s my code:
from pylons import config as pylons_config from tg.configuration import config class MyAppConfig(AppConfig): def setup_sqlalchemy(self): """Setup SQLAlchemy database engine.""" from sqlalchemy import engine_from_config engineOne = engine_from_config(pylons_config, 'sqlalchemy.first.') engineTwo = engine_from_config(pylons_config, 'sqlalchemy.second.') config['pylons.app_globals'].engineOne = engineOne config['pylons.app_globals'].sa_engine = engineTwo # Pass the engine to initmodel, to be able to introspect tables init_model(engineOne, engineTwo) base_config = MyAppConfig()
It should be noted here that the default websetup.py file is hardwired to use a variable called “sa_engine” when you’re using authentication. Thus, I assign one of the engines to that variable above. You could also go into websetup.py and just edit it as needed to accept whatever customizations are required. I do not know if there are additional files that need to be modified as this I ended up just changing the variable name above rather than mess with additional issues like this one.
Once all that is done, you should be good to go. On the TurboGears IRC, one of the members there mentioned a way to use multiple databases by “calling setup twice”. He didn’t explain how this worked, so I don’t know if it is simpler, but if you want to investigate this method and report back, that’s fine by me. I went ahead and created some dummy files with my changes so you can see then in context. They are downloadable below:
Nice post thank you.
I think this can be improved in two ways
1- I believe you can use just one “maker” instance.
2- setup_sqlalchemy could be made to iterate over the keys.
hit me up on IRC, if you can come on Saturday for the bugfix day we can get this in.
Regards elpargo.
Is it possible to load databases whose name will be determined during runtime, i.e. without having to add their names to the config file?
Hah! Nice to know I’m not alone. I’m trying to choose between Django 1.0 and TG2.
Similar problem. We want to make user web pages to access our legacy DBs.
Django has a hack for multiple Dbs. Similar in type to what you do here.
The only thing that is making me choose Django (at least for now) is that I know the DEV branch is working on it. In fact, they hope to finish it up this summer with GSOC. Which means it could be rolled into Trunk in January.
So, I “know” its coming. Whereas, with TG2 it seems to be more of a question mark.
Thanks for the post!
Hah! Nice to know I’m not alone. I’m trying to choose between Django 1.0 and TG2.
Similar problem. We want to make user web pages to access our legacy DBs.
Django has a hack for multiple Dbs. Similar in type to what you do here.
The only thing that is making me choose Django (at least for now) is that I know the DEV branch is working on it. In fact, they hope to finish it up this summer with GSOC. Which means it could be rolled into Trunk in January.
So, I “know” its coming. Whereas, with TG2 it seems to be more of a question mark.
Thanks for the post!
This is very good information – thanks for sharing.
I’m going to add some of this to my application in turbogears, to replace an old way I was connecting to multiple legacy databases.
I may even try elpargo’s suggestion to iterate over keys to handle the case where I may add an additional database as needed.
Cheers, Laurin
This is very good information – thanks for sharing.
I’m going to add some of this to my application in turbogears, to replace an old way I was connecting to multiple legacy databases.
I may even try elpargo’s suggestion to iterate over keys to handle the case where I may add an additional database as needed.
Cheers, Laurin
@Richard actually that’s wrong. This is not a hack TG2 is build to support multiple DBs, sqlalchemy has done so pretty much since day one. That is totally different from the “django ORM”
The subclass is needed for convenience and it is indeed coming. I know for sure this will land into 2.1 and I’m part of the core team.