Posts tagged php
The following write-up leans on an article I read years ago, but this post's essence should still hold up today. It is targeted at the PHP landscape, and is not something you need for your personal blog. Do use a static website generator if you care for speed there, your wordpress is gonna be hacked some day anyway.
If you already have two servers, one for your webserver, one for your database, this is more likely for you. If you already have a loadbalancer in front of two webservers, this is definitely for you.
But even big webshops usually do not put such measures in place, unless they really do care about their response times and thus about their Google ranking.
Mostly this is guidance on how to tackle the customer's favourite complaint: 'It is so slow!', plus some background.
When trying to fix a 'slow' website, there are several approaches.
- fix the website code
- throw hardware at the problem
- change the underlying infrastructure (software-wise)
The first usually is not going to happen, as good web developers, especially in the PHP universe, are just rare. They fight their codebase and are happy when things are working correctly. Performance comes second, and profiling their application is something they often have never done or even heard about.
The second used to be a nice solution, but these GHz numbers don't improve as drastically as they did in the past. And since the memory wall gets hit, this solution also ceases to be a viable approach, no matter how fast you make your webserver connect to your database. SSD's do help, but only so much.
Which leaves us with option three, or the following measures in particular:
- SSD's (noted here for sake of completeness)
- handle sessions via redis
- separate static from dynamic content and serve each via different webservers
- browser caching
- accelerators, like squid or varnish in front
- opcode caches
- database caching via memcached to relieve the main database
- CDN's like Akamai
If you are about to migrate your website onto new hardware anytime soon (onto a new single server is what we talk about), think about getting SSD's. These have capacities of 250GB upwards, so four of them in a RAID10 setup give you about 500GB of usable, redundantly persisted space. No matter how big your web presence usually is, after subtracting 20GB for a linux operating system, this leaves you with plenty of diskspace for whatever your future may hold for you.
For budgeting reasons, a RAID1 setup of two SSD's, still providing ~230GB of space, is usually sufficient, unless you plan on storing literally shitloads of FTP data or useless backups.
Backups are to be done off-site on another server anyway.
You don't need version control on your production server anyway (except for /etc maybe), unless you think you are a true DevOp, fight others to the bone about the agile kool-aid and know jack-shit anyway.
But, no hard feelings, point is, just get SSD's if you can afford them.
This is sort of a pre-requisite, depending on your overall approach; see the conclusion at the bottom for what this means.
If you happen to have a lot of sessions, this improves things a bit. Reading sessions from an in-memory database is just plain faster than letting the webserver get them from the harddisk every time. This is only true if redis runs on the same machine as your webserver, as network latency is almost always higher than the latency of disk I/O operations. If you have a direct crosslink to your dedicated session server with 10G NIC's, this is not true, but if you have this in place you sure as hell do not need this whole article.
If you however split your load onto several webservers behind a loadbalancer, and want a real 'shared-nothing' architecture, things are different. In that case, you don't have your loadbalancer configured to use sticky sessions, and so you need a central place where your sessions are managed. 'Sticky sessions' simply means that each user is served by the same webserver every time they visit your website.
Unless you really think you need a shared nothing installation or have REALLY many sessions, you don't need this.
Use redis instead of memcached for this, as the former can persist the in-memory data to disk.
static vs. dynamic content
Classify all your content into one of these two groups. Put all static content on a separate webserver which handles only a subdomain of your site and serves nothing else. This frees up resources on your 'dynamic' webserver. If you put both on the same hardware and the dynamic webserver eats all the resources, it's no use doing that, of course. You need another server in this case.
In case you read this, and have zero clue what static vs. dynamic means:
- static content = html files on your website
- dynamic content = html code generated from your php code which is then inserted in the already existing html code mentioned as 'static'
Classify your content by how often it changes, maybe like this:
- never change (6 months caching time)
- seldom change (1 week)
- often (1 day)
- always (1 minute)
This is just rough guidance off the top of my head, adjust to your needs.
Set the caching headers of your HTTP responses accordingly, and let the user's browser help you reduce your servers' load.
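As a rough sketch of what 'accordingly' could look like, here are the four classes from above turned into Cache-Control values. The class names and the function name are made up for this example, and 6 months is simply taken as 180 days.

```shell
#!/bin/sh
# Sketch: map a content class (hypothetical names) to a Cache-Control header.
cache_control_for() {
    case $1 in
        never)  age=$((180 * 24 * 60 * 60)) ;;  # never changes: roughly 6 months
        seldom) age=$((7 * 24 * 60 * 60)) ;;    # seldom changes: 1 week
        often)  age=$((24 * 60 * 60)) ;;        # often changes: 1 day
        always) age=60 ;;                       # always changes: 1 minute
        *) echo "unknown class: $1" >&2; return 1 ;;
    esac
    printf 'Cache-Control: max-age=%s\n' "$age"
}

# usage: cache_control_for seldom
```

How the header actually ends up in the response depends on your webserver; this only does the arithmetic.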
To still be able to replace old content with new content, add hashes to the URLs of your 'never changes' content. That way, when things change, the new content will be served no matter what cache expiration times you use. These hashes have to be created automatically during your deployment process and inserted into your application code, also automatically.
This is actually something rather sophisticated, but otherwise you have the same problem as with using 301 Redirects: Caching times and permanent redirects don't forgive fuck-ups on your behalf.
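A minimal sketch of the hashing part of such a deployment step, assuming a Linux box with sha256sum available. The function name hashed_name and the 12-character hash length are my own choices for illustration, nothing prescribes them.

```shell
#!/bin/sh
# Sketch: print a cache-busting file name derived from the file's content,
# e.g. style.css -> style.<first 12 hex chars of its sha256>.css
hashed_name() {
    file=$1
    hash=$(sha256sum "$file" | cut -c1-12)
    base=$(basename "$file")
    case $base in
        *.*) printf '%s.%s.%s\n' "${base%.*}" "$hash" "${base##*.}" ;;
        *)   printf '%s.%s\n' "$base" "$hash" ;;   # no extension: append hash
    esac
}

# usage (the path is a placeholder): hashed_name htdocs/css/style.css
```

The name only changes when the file content changes, so unchanged files keep their cached URL. Getting the printed name into your application code is the part that depends entirely on your setup.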
reverse-proxies for accelerating things
If you already separated dynamic vs. static content, what sense does it make to put an accelerator in the form of a reverse-proxy like squid or varnish up front, too?
Accelerators create something like 'static snapshots' of your combined static-dynamic content and serve them directly. That is, an accelerator stores the finished html, made up of the already existing static html parts and the html generated by the interpreted php code, and serves it as if it were static.
Some made up numbers for a big website and requests being possibly served per second:
- dynamic content webserver: 100
- static content webserver: 5k
- accelerator: 250k
It is important to differentiate between HTTP GET and all the other requests. GET's don't change things, POST's or PUT's and such do. GET's are served by the accelerator, but the others must pass through all your caching layers.
PHP instructions are parsed and translated into operation codes (bytecode for PHP's virtual machine, not machine language) every time they are executed by the php interpreter. Opcode caches like APC keep the compiled opcodes in memory, so this step is skipped on subsequent requests.
Your website can become up to three times faster, just through the opcode cache.
memcached is an in-memory key-value store, caching often-used data from the database. Any questions why this might be beneficial? ;)
Sidenote: For PHP there exist two client extensions, memcache and memcached, which are separate projects. Don't get confused by that.
content delivery networks
Unless you have seriously big traffic spikes (read: if you wonder whether this is happening, it isn't), you don't need a CDN.
CDN's are put in place by exchanging the subdomain pointing to the data served by your static webservers for another subdomain pointing to the CDN. This is helpful if you have had single occasions, like special days, where you knew your load would be ridiculously high and you'd need a lot more serving power than you usually do.
If you need your CDN to not just serve static content but complete sites, you can't just use your own loadbalancer. The CDN's loadbalancer must be configured and put to work.
So instead of getting more machines yourself, set things up accordingly and employ a CDN of your choice. Akamai is rather good.
Else your machines will idle around for 99.99% of the time you have them in place, and you would have a hard time making a profit out of them. Ever wondered why amazon is such a huge cloud provider? Those are just their machines that would otherwise be doing nothing, since it is not christmas yet.
In case you have a single server where a webserver and a database server run on, what are the easiest steps for speeding up things?
- opcode cache
The SSD's help, too, but usually you get them up front, not after your installation is already running, as there are migration fees to be paid if you want your provider to reinstall your hosting.
Next, categorize your content and, once all this is done, implement caching.
The next step would be more hardware and distributing the load onto several webservers.
So what if you already have more than one server?
Do the first three points mentioned above and caching.
Then set up dedicated session handling, so your load will be distributed more evenly across your servers when using a loadbalancer.
For setting everything else up, you should know what you are doing and not just be reading this.
For squeezing a little bit more performance out of a php page backed with mysql, try the following in php.ini:
;max_execution_time = 30
max_execution_time = 60
;memory_limit = 128M
memory_limit = 512M
And in the mysql configuration (my.cnf):
#key_buffer = 16M
key_buffer = 32M
#query_cache_limit = 8M
query_cache_limit = 8M
#query_cache_size = 16M
query_cache_size = 64M
Commented lines are the default values for comparison.
Migrating a website can be a tedious task, if you have problems keeping several things at once inside your head. This aims to solve this problem by presenting some proper guidelines.
Here we have a standard dynamic website with a mysql backend, served through an apache httpd.
For other databases/webservers the steps may differ in detail, but essentially it is the same theory every time.
Mail migration will as of now not be a part of this, since it's gonna be long enough anyway.
Read this completely before you start, as alternative ways are suggested sometimes.
This part is almost the most important, actual copying is usually not that hard if you know what you are doing. It's often harder to remember everything.
Before we start, the server can serve data of three kinds which are handled all the same way.
- web data, just copy the website code
- database, copy the database dump file
- emails, copy the mailfiles
The server is accessed via the globally available...:
Basically these are the things you have to copy/adjust so things will go smoothly.
Putting most of these questions plus the answers to them into a spreadsheet is not the worst idea. Maybe I will come up with a shell one-liner to create a .csv later.
Also it is helpful if you are able to do FXP (transfer files from one host directly to the other, without temporarily saving the data/files locally), if you do not have SSH access.
server access via ssh is possible?
ssh works via key? or password only?
root account? (a lot of this guide assumes root privileges, I might have missed points where there are no alternatives)
if not, do you have all necessary account credentials for all folders etc.?
DO THESE WORK?
if no ssh, do you have ftp credentials?
do the credentials actually work?
do you get a database dump you can transfer? (If you cannot access the server, you can't make a dump.)
are the folders accurately named?
how BIG is the webfolder? (so how long will copying take?)
which database management system is used? (i.e. mysql or postgres)
database credentials for it are?
what is the database the site is using actually called?
just how BIG is the database? (and so how long will copying take?)
what domains are pointing to the server?
are these actually active?
and can you change the DNS RR?
what are the DNS TTL times?
is mailing configured?
don't forget the DNS MX RR/RR's while at the last point
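For the 'how long will copying take?' questions, a back-of-the-envelope sketch can help. It assumes you know the size in KiB (which is what du -sk prints) and the link speed in Mbit/s; the function name is made up, and protocol overhead and disk speed are ignored, so treat the result as a lower bound.

```shell
#!/bin/sh
# Sketch: estimate copy time in whole seconds.
#   $1 = size in KiB (e.g. the first column of `du -sk`)
#   $2 = link speed in Mbit/s
copy_seconds() {
    size_kib=$1
    mbit=$2
    # KiB -> bytes -> bits, divided by bits per second
    echo $(( size_kib * 1024 * 8 / (mbit * 1000000) ))
}

# usage (the path is a placeholder):
#   copy_seconds "$(du -sk /var/www/example | cut -f1)" 100
```

The same works for a database dump file once you have one lying around.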
DNS: acquiring information on active resource records
For finding out about the dns, if you have several virtual hosts on the same machine, try grepping them all there.
When having an apache, grep all vhost files for ServerName and ServerAlias.
Here's a kind-of snippet, which will work if your apache vhost configs are in default locations and indented:
\grep -e '^\s\+Server' /etc/apache2/sites-enabled/*
This shows only active sites, check sites-available if you have to migrate sites which are currently turned off, too.
The resulting list, if sanitized, can be piped on the shell and used with something like dig +short, to easily check which domains are still running.
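For the sanitizing part only, here is a minimal sketch. It assumes one directive per line and plain hostnames without scheme or port; the function name vhost_names is made up.

```shell
#!/bin/sh
# Sketch: list all ServerName/ServerAlias hostnames from the given vhost
# files, lowercased, one per line, without duplicates.
vhost_names() {
    grep -h -i -E '^[[:space:]]*Server(Name|Alias)[[:space:]]' "$@" |
        awk '{ for (i = 2; i <= NF; i++) print tolower($i) }' |
        sort -u
}

# usage:
#   vhost_names /etc/apache2/sites-enabled/* |
#       while read -r name; do
#           printf '%s: ' "$name"; dig +short "$name"; echo
#       done
```

Unlike the broader grep above, this skips ServerAdmin and friends, so only actual hostnames end up in the list.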
Check all the records, not just the A and AAAA (quad-A is ipv6, single-A is ipv4) records, also MX and whatever is set.
If the output is empty (dig exits with 0 even when no record exists), there is no dns anymore and less work for you.
Providing a complete script here would not help much, since you should know what you are doing here anyway.
and maybe prepare the webserver, too
In case the apache config is, let's say, 'adventurous', do apache2ctl -S (Debian/Ubuntu) or httpd -S to see which domains are hosted, and in which file these are defined.
Then search there for
If the webserver happens to have all vhosts defined in one huge file (which is just... very not great), remove the configuration from there and place each vhost into a separate file.
In Debian-based Linuces you can use a2ensite <vhost-config-filename> / a2dissite <vhost-config-filename> to enable/disable single websites easily.
On Redhat-based ones you create the symlinks to the configfolder apache is configured to load manually and delete them also by hand. (This isn't any different from what
All this only for the sites you want to migrate.
Of course, you can just comment out the information on your vhosts from the config, but just... don't.
For other webservers all this is different, of course, but you get the idea.
DNS: get the domains and the website together, information-wise
Refer to the website via its main link. (ServerName from above.)
But make sure to note all other aliases there, too. (ServerAlias from above.)
Since you can only migrate one site after another, this helps to keep track.
Write all this down, each alias in another row.
Maybe put the inactive ones into an extra column there, too.
Could be that these should be prolonged again, or were incorrectly set.
(I.e. it did not point to the webserver when you checked.)
Write the set TTL into the next column, along with the current date. (Usually the TTL is 86400 seconds, which means 24 hours, which is exactly how long it will take until your change to 1800 seconds finally becomes active. If the TTL was longer than 86400 for whatever reason, note that in your list, too!)
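To get the list started, something like the following sketch could produce the skeleton of that spreadsheet. It reads 'main-link alias' pairs from stdin, one alias per row; the column names and the function name are just a suggestion, and the active and TTL columns are left empty to be filled in by hand.

```shell
#!/bin/sh
# Sketch: turn "site alias" pairs (one pair per line on stdin) into CSV rows.
# Columns: site, alias, active (empty), ttl (empty), date of the check.
migration_csv() {
    today=$(date +%Y-%m-%d)
    echo 'site,alias,active,ttl,checked_on'
    while read -r site alias; do
        if [ -n "$site" ]; then
            printf '%s,%s,,,%s\n' "$site" "$alias" "$today"
        fi
    done
}

# usage: printf 'example.com www.example.com\n' | migration_csv > migration.csv
```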
DNS: lower TTL the day before the migration
After having created a list and checked which domains are currently active, set the default TTL time to 1800. (Just don't go below, 30 mins are short while you do the migration. Also the registrar might prefer you not to.)
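To check which TTL is actually being handed out, dig's answer section can be boiled down to name and TTL. A small sketch, with a made-up function name; it assumes the usual answer line layout of name, TTL, class, type and data.

```shell
#!/bin/sh
# Sketch: print "name ttl" for every record line of dig's answer section
# read from stdin; comment lines (starting with ';') are skipped.
answer_ttls() {
    awk 'NF >= 5 && $1 !~ /^;/ { print $1, $2 }'
}

# usage: dig +noall +answer example.com A | answer_ttls
```

Keep in mind that a caching resolver shows the remaining TTL, not the configured one. Ask one of the authoritative nameservers directly (dig @nameserver ...) to see the configured value.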
DNS: plan b in case you have dozens of websites to migrate
If you have A LOT of websites that should go from one server to the next, try migrating and testing everything (via entries in the hosts file). Then switch the ip's of the servers with each other. That way no dns changes are needed (except if you have dead domains), because this shit can become tedious, too.
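The hosts file entries for such a test run can be generated from the list of hostnames. A minimal sketch: the function name is made up and 203.0.113.10 is only a placeholder for the new server's address.

```shell
#!/bin/sh
# Sketch: print one hosts-file line per hostname read from stdin,
# all pointing at the IP given as the first argument.
hosts_entries() {
    ip=$1
    while read -r name; do
        if [ -n "$name" ]; then
            printf '%s\t%s\n' "$ip" "$name"
        fi
    done
}

# usage: printf 'example.com\nwww.example.com\n' | hosts_entries 203.0.113.10
```

Paste the output into the hosts file of the machine you test from, and remove it again afterwards. For a quick single check without touching the hosts file, curl --resolve example.com:80:203.0.113.10 http://example.com/ does the same for one request.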
TBD / todo
Nothing more here now, until I am motivated again to write more stuff up.
The easiest way to locate the php error log is to use this on the shell:
php --info | grep error