Posts Tagged ‘Google App Engine’
Free Captchas, Google App Engine and OCR
Captchas are the distorted, almost unreadable strings you have to retype in a web page in order to do a specific action. The purpose of the captcha is to make sure the form has been filled by a human and not a “bot”. Most of the time you will find captchas on login and comment pages.
When the time came to add captchas to my CryptoEditor website, I found a service provided by Google called reCaptcha. ReCaptcha takes advantage of the massive number of people who use it every day to digitize printed books. How can that be? It’s very simple!
Instead of showing the user a single combination of letters and numbers, reCaptcha asks you to decipher two words. One is a word whose digital transcription reCaptcha already knows; the other is a word scanned from a printed book that the OCR application was unable to read. Of course, you don’t know which is which.
The word reCaptcha already knows is used to grant or deny you access to the service. Your guess for the other word, the one the OCR could not read, is added to a database used to train the OCR software reading that book.
Integrating reCaptcha into a Google App Engine application is quite straightforward. First, download the reCaptcha Python API. Then copy the captcha.py module into your project and adapt it to use the GAE urlfetch module instead of urllib2 to call the reCaptcha server over HTTP.
If you are lazy like I am, you can download an already adapted version of captcha.py from Joscha Feth’s blog. His post explains in detail how to implement it in your Google App Engine application.
To use the service, you need to register your websites with it. Since it is a Google service, you can sign in with your Google login. You will be given a private key and a public key. NEVER publish or share your private key.
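To give an idea of what the adapted module has to do, here is a minimal sketch of the verification step. It is not the code from captcha.py. The endpoint URL, the field names and the plain-text "true"/"false" reply format are what I recall of the legacy reCaptcha API, so treat them as assumptions and check them against the documentation. The function names and the `fetch` parameter are hypothetical, introduced only for this illustration.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, the runtime GAE offered at the time

# Assumption: legacy reCaptcha verification endpoint; check the current docs.
VERIFY_URL = "http://www.google.com/recaptcha/api/verify"


def build_verify_payload(private_key, remote_ip, challenge, response):
    """URL-encode the four fields sent with the verification call."""
    return urlencode({
        "privatekey": private_key,
        "remoteip": remote_ip,
        "challenge": challenge,
        "response": response,
    })


def parse_verify_response(body):
    """Return (is_valid, error_code) from the plain-text reply.

    Assumed format: first line "true" or "false"; on failure the
    second line carries an error code.
    """
    lines = body.splitlines()
    is_valid = bool(lines) and lines[0].strip() == "true"
    if is_valid:
        return True, None
    error_code = lines[1].strip() if len(lines) > 1 else "unknown"
    return False, error_code


def verify_captcha(fetch, private_key, remote_ip, challenge, response):
    """Check a captcha answer; fetch(url, payload) must return the reply body."""
    payload = build_verify_payload(private_key, remote_ip, challenge, response)
    return parse_verify_response(fetch(VERIFY_URL, payload))
```

On App Engine, `fetch` could be a small wrapper that returns `urlfetch.fetch(url, payload=payload, method=urlfetch.POST).content`. Passing the fetch function in keeps the private key handling and the parsing testable without any network access.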
Visit the reCaptcha wiki site to find information on how to personalize the look and feel of the captcha form using CSS.
Go Dynamic with Google App Engine Cron Jobs
Shared hosting is still the most affordable solution for small and medium business projects. However, one of the features you often have to give up with shared hosting is autonomous server-side processing.
Imagine you want to automatically send an email to all users whose accounts have expired, or clean a server cache once a day to save storage space. With standard web hosting services, you will hit a wall. To get this kind of capability you usually need to buy a dedicated hosting plan for several hundred dollars a month.
Another solution would be to buy a Linux slice hosting plan, but then you are on your own to secure and manage your server. This is a specialty not many programmers and web developers are familiar with, and it takes a lot of your precious time just to keep the server up.
So, what would you think of a free solution to your server-side processing needs? This is what Google App Engine offers, within its free quotas, through its cron jobs implementation. You have full control over the scheduling of the cron jobs through the cron.yaml configuration file.
A Google App Engine cron job can call any URL of your App Engine application. From there, your Python code can query your website using the URL Fetch service. You can also send emails using the Mail service.
You can do all that while keeping your actual website where it is. If you ever consider moving your website to another hosting provider, your Google App Engine cron job will continue to work with no modifications.
Following is an example of a simple cron job definition.
Your comments are welcome,
Philippe Chrétien
app.yaml
application: yourappname
version: 1
runtime: python
api_version: 1
handlers:
- url: /.*
  script: main.py
cron.yaml
cron:
- description: daily mailing job
  url: /mailjob
  schedule: every 24 hours
main.py
#!/usr/bin/env python
import wsgiref.handlers

from google.appengine.ext import webapp
from google.appengine.api import mail
from google.appengine.api import urlfetch


def doSomethingWithResult(content):
    # Placeholder: process the page returned by your website here.
    pass


class MailJob(webapp.RequestHandler):
    def get(self):
        # Call your website using URL Fetch service ...
        url = "http://www.yoursite.com/page_or_service"
        result = urlfetch.fetch(url)
        if result.status_code == 200:
            doSomethingWithResult(result.content)

        # Send emails using Mail service ...
        mail.send_mail(sender="admin@gmail.com",
                       to="someone@gmail.com",
                       subject="Your account on YourSite.com has expired",
                       body="Bla bla bla ...")


# Map the URL declared in cron.yaml to the handler above.
application = webapp.WSGIApplication([
    ('/mailjob', MailJob)], debug=True)


def main():
    wsgiref.handlers.CGIHandler().run(application)


if __name__ == '__main__':
    main()
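One caveat with the listing above: /mailjob is an ordinary URL, so anyone who knows it can trigger the job. As far as I know, App Engine marks requests issued by its cron service with an `X-AppEngine-Cron: true` header and strips that header from external requests. The helper below is a minimal sketch built on that assumption; the function name is hypothetical and not part of the SDK.

```python
def is_cron_request(headers):
    """Return True when the request carries App Engine's cron marker header."""
    # Lower-case the names because HTTP header names are case-insensitive.
    normalised = dict((name.lower(), value) for name, value in headers.items())
    return normalised.get("x-appengine-cron", "").strip().lower() == "true"
```

Inside `MailJob.get`, a guard such as `if not is_cron_request(self.request.headers): self.error(403); return` would turn other callers away. Restricting the handler with `login: admin` in app.yaml is another option worth checking in the documentation.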
Technology Cost
When I start a new web project, I am more excited by the new technology challenges I have to face than by the financial and legal aspects of the project. However, these aspects have given me a lot of headaches over the last few months.
Two years ago, excited by my new idea, I decided to start working with the technologies I was comfortable with at the time: C#, Web Services and MSSQL. My project requires both a web site and a downloadable client application.
When the time came to publish the first version of the application online, I started to search for a hosting service. That was my first disappointment. I found that Windows/MSSQL hosting plans cost about twice as much as plans based on open source technologies, for much less storage and bandwidth. This difference is due to the price of the Windows and MSSQL licenses the hosting companies pay to Bill. Since I had built my entire project with these technologies, I decided to pay the extra fees and picked a web host with a good reputation.
The publishing process went very well and I was able to start performance tests quickly. That was yet another disappointment. Performance ranged from poor to bad, and I could not seriously go online with it. After a chat with the hosting support I learned that, in addition to the higher prices, Windows/MSSQL hosting plans had a higher users/servers ratio than the Linux plans. This, again, was because of the Windows server license costs.
To keep my project going, I decided to release a first version on this hosting plan and to move to a dedicated server plan once traffic grew. I then started looking for a content management system to build my web site. There are not many CMS choices for the Windows/MSSQL platform. Some hosting companies offer Joomla or Drupal on Windows, but this solution is far more complicated to maintain on Windows than on Linux. I decided to go with the only serious and affordable application, DotNetNuke (DNN).
Guess what! The DNN back end is based on the MSSQL database server. I can hear you: “That’s not a problem, you are on a Windows/MSSQL hosting plan.” You are right, but here is the catch … most serious DNN hosts allow only one MSSQL database per site, with very limited storage. So where should I store my application database?
I found a DNN host, PowerDNN, that offered a MySQL database in addition to the MSSQL database. I then decided to get rid of the MSSQL constraints by developing a new database connector for MySQL. I was no longer dependent on MSSQL for the choice of my hosting service. Moving to MySQL was a great improvement. With PowerDNN I had 1 GB of database storage across one MSSQL and one MySQL database. On the other hand, with a hosting plan from Siteground, I had 750 GB of storage with an unlimited number of MySQL databases.
But I was still bound to Microsoft technologies because of the web service layer I had developed using C# and the .NET framework. To take advantage of these cheaper and faster hosting plans I had to get rid of this layer as well. I tried different technologies … PHP, Python CGIs, Perl … with no success. The difficulty was finding a good framework that would allow great scalability.
I then started playing with Google App Engine (GAE). Google App Engine’s main feature is that your application is hosted in the Google Application Cloud. If implemented correctly, your application can scale from a few users to millions of users with no hardware or software modifications. Best of all, GAE is free under a certain quota of storage, bandwidth and CPU usage. That’s what I was looking for. GAE is based on open source technologies and supports the Python and Java languages. Open source projects are hard to kill and I am confident that the community will support the Google initiative to make GAE a strong and affordable hosting solution.
That’s where I am today. I have a .NET client application using a Python-based server hosted in the Google App Engine cloud. I take advantage of Bigtable and the other GAE services to give my application maximum scalability.
Lately, I started doing some tests with Mono for the client application, with very good results. There are some user interface problems, but it will be quite easy to make the application work well on both Windows and Linux. Having a Linux version of my application will give it great exposure among technology early adopters.
The numbers speak for themselves. I started working on a platform that cost me more than $4,000 in licenses alone for my development environment at home. Then I had to pay for a $50 per month DNN hosting plan to go online with my project. That gave me 1 GB of storage, with no solution for the future when, I hope, the traffic will rise to thousands of users.
Now the application works on both Windows and Linux, with a Python server hosted on the Google Application Cloud. The development license cost dropped to $0 and hosting is free until the site takes off!
Next time you start a new project, don’t waste time and money as I did. I could have put this new application online a lot faster and for free if I had made the correct decisions on day one of the project.
Have any experience you want to share? I would like to hear from you. Write a comment or send me an email; I’ll be more than happy to discuss the topic with you folks.
Philippe Chrétien
basbrun.appspot.com
I have updated my Google App Engine base website. You can try it at basbrun.appspot.com and get the code at http://github.com/pchretien/gaebase. This version merges in the development done on secretquiz.com.
The site offers a fully functional user management system with a simple contact form. The code is distributed under the GPL, so feel free to use it in your project.
Philippe Chrétien
Google App Engine Email Service
Help improve Google App Engine: go to the Google App Engine Issues page and vote for issue 677. This issue is a big no-go for many GAE developers using the Mail service:
http://code.google.com/p/googleappengine/issues/list
Take a few minutes to read more about the needs of the Google App Engine community.
Philippe Chrétien
SecretQUIZ.com Alpha
I have released a first version of SecretQUIZ.com. This release is in its Alpha stage so it may change considerably in the next few days.
SecretQUIZ allows you to securely share information with your friends without the need for complicated encryption applications. SecretQUIZ is based on the authentication principle: you write a small quiz with questions you know only your friend can answer. Once all the questions have been answered correctly, the secret message is delivered.
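To make the principle concrete, here is a minimal sketch of the "all answers correct, then deliver" check. It is an illustration only, not the actual SecretQUIZ code: the quiz layout (a dict with `salt`, `secret` and `answer_digests`), the salted SHA-256 comparison and every function name are assumptions made for this example.

```python
import hashlib
import hmac


def normalise(answer):
    """Ignore case and extra whitespace so ' Quebec  City ' matches 'quebec city'."""
    return " ".join(answer.lower().split())


def digest(answer, salt):
    """Salted SHA-256 of a normalised answer, so plain answers need not be stored."""
    return hashlib.sha256((salt + normalise(answer)).encode("utf-8")).hexdigest()


def reveal_secret(quiz, answers):
    """Return the secret only if every question is answered correctly, else None."""
    if len(answers) != len(quiz["answer_digests"]):
        return None
    checks = [
        hmac.compare_digest(digest(given, quiz["salt"]), expected)
        for given, expected in zip(answers, quiz["answer_digests"])
    ]
    return quiz["secret"] if all(checks) else None


# Hypothetical two-question quiz.
quiz = {
    "salt": "s1",
    "secret": "Meet at noon",
    "answer_digests": [digest("Rex", "s1"), digest("Quebec City", "s1")],
}
message = reveal_secret(quiz, ["rex", " quebec  city "])
```

Every answer is checked before deciding, so a wrong guess does not reveal which question failed. In this sketch the secret itself sits in plain text next to the quiz; a real service would need to think about how it is stored.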
SecretQUIZ.com has been developed using Google App Engine and Python. This should allow scalability in the Google cloud without having to change my code or to buy new hardware.
I am releasing this application free of charge. I reserve the right to charge for some parts of the service in the future if the ads model cannot cover maintenance fees. If you read between the lines, this means “click the ads please”!
Feel free to report bugs and to correct my approximate English here on this blog. I’ll follow up on all your comments.
Visit us at www.secretquiz.com
Thank you,
Philippe Chrétien
Google App Engine 1.2.1
Only two weeks after the release of 1.2.0, Google has released version 1.2.1 of its Google App Engine SDK. This new release includes two great new features, the PyCrypto library and the DataStore Remote API.
PyCrypto now provides strong and fast encryption functionality. In previous versions, strong encryption such as DES/3DES was not available in the framework. There were some pure Python implementations of DES, but they were a lot slower and cost a lot of CPU time.
The other great addition in 1.2.1 is the DataStore Remote API. This new API allows you to write client-side applications that use the distributed DataStore. Using this API you can run maintenance and diagnostic code on a remote machine. The bandwidth used counts against your incoming and outgoing bandwidth quota.
More information on this new release at:
http://code.google.com/p/googleappengine/wiki/SdkReleaseNotes
Philippe Chrétien
Google App Engine
Cloud computing is gaining popularity every day. A young company can now develop a website that serves millions of users at very low cost. Google is a leader in this field, along with Amazon and a few other companies.
Google App Engine (GAE) lets a developer build a web application in Google’s cloud and thereby benefit from unlimited growth capacity. To achieve this, GAE gives programmers a set of basic tools that, if used well, let them take full advantage of the cloud’s growth potential.
Google App Engine is available for Python and, for the past few weeks, for Java. GAE was originally available only for Python, so I recommend using that language rather than Java. The open source/Python community is much more dynamic than the very “corporate” Java community.
Enough context, let’s get to the facts. The approach GAE proposes differs from the traditional cloud computing approach, which generally offers users the ability to add an unlimited number of virtual machines to support their growth. GAE goes further by offering new programming tools designed to take full advantage of Google’s cloud. These tools are:
- The DataStore, which replaces the relational databases traditionally used in large sites.
- The Memory Cache, which keeps data in memory and makes it available to the whole application.
- The Mail Service, which lets you send emails to users.
- The URL Fetch service, which gives access to the content of external sites using GET or POST.
- The Image Service, which is used to manipulate and modify images.
- The Google Account service, which lets you use Google accounts for your users.
I started working with GAE at the beginning of April. I have mainly used the DataStore, the Memory Cache and the Mail Service. These tools are very simple to use and, if used well, let you develop an application that can grow without limit.
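As an illustration of how simple these tools are, here is a minimal sketch of the usual way to put the Memory Cache in front of a slow lookup such as a DataStore query. The helper name `cached_fetch` and the `DictCache` stand-in are hypothetical. The sketch only assumes a cache object offering `get(key)` and `set(key, value, time=...)`, which is the shape of the calls in App Engine's memcache module.

```python
def cached_fetch(cache, key, load, ttl_seconds=60):
    """Return the cached value for key; on a miss, call load() and cache the result."""
    value = cache.get(key)
    if value is None:
        value = load()
        cache.set(key, value, time=ttl_seconds)
    return value


class DictCache(object):
    """In-memory stand-in with the same get/set shape, handy outside App Engine."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, time=0):
        self._data[key] = value  # expiry is ignored in this stand-in
        return True


# Count how often the slow lookup really runs.
calls = []

def slow_lookup():
    calls.append(1)
    return ["alice", "bob"]

cache = DictCache()
first = cached_fetch(cache, "user_list", slow_lookup)
second = cached_fetch(cache, "user_list", slow_lookup)
```

On App Engine, the `google.appengine.api.memcache` module itself could be passed as `cache`. A cached entry can be evicted at any time, so `load` must always be able to rebuild the value.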
I recommend this short video to get an idea of the development cycle of a GAE application:
http://www.youtube.com/watch?v=bfgO-LXGpTM
I have also developed a small web application that manages users without relying on the Google User Account service. This application uses Django as its web framework. You can download the code from my GitHub at the following address:
http://github.com/pchretien/gaebase/tree
Your comments are welcome,
Philippe Chrétien
