With PHP, I would advise you to use the Simple HTML DOM Parser. The best way to learn more about it is to look for samples on the ScraperWiki website.
NOTE: This answer was originally posted at StackOverflow.com by mnml
- Jacqueline answered 13 years ago
One general approach I haven’t seen mentioned here is to run HTML through Tidy, which can be set to spit out guaranteed-valid XHTML. Then you can use any old XML library on it.
But to your specific problem, you should take a look at this project: http://fivefilters.org/content-only/ — it’s a modified version of the Readability algorithm, which is designed to extract just the textual content (not headers and footers) from a page.
NOTE: This answer was originally posted at StackOverflow.com by Eli
- Joyce answered 14 years ago
This sounds like a good task description of W3C XPath technology. It’s easy to express queries like “return all href attributes in img tags that are nested in <foo><bar><baz> elements.” Not being a PHP buff, I can’t tell you in what form XPath may be available. If you can call an external program to process the HTML file, you should be able to use a command-line version of XPath.
For a quick intro, see http://en.wikipedia.org/wiki/XPath.
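For what it's worth, PHP ships XPath support in its bundled DOM extension (DOMDocument and DOMXPath). Here is a minimal sketch, assuming that extension is enabled; the markup, the div id and the query are made up purely for illustration:

```php
<?php
// Hypothetical markup, used only to illustrate the query.
$html = '<html><body><div id="content">'
      . '<p><img src="a.png" alt="first"></p>'
      . '<p><img src="b.png" alt="second"></p>'
      . '</div><img src="footer.png"></body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// All src attributes of img elements nested anywhere inside div#content.
$sources = [];
foreach ($xpath->query('//div[@id="content"]//img/@src') as $attr) {
    $sources[] = $attr->nodeValue;
}

print_r($sources);
```

The footer image is skipped because it sits outside the div that the query is restricted to.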
NOTE: This answer was originally posted at StackOverflow.com by Jens
- Allen answered 14 years ago
DO NOT USE SELF:: use STATIC::
There is another aspect of self:: that is worth mentioning. Annoyingly, self:: refers to the scope at the point of definition, not at the point of execution. Consider this simple class with two methods:
class Person
{
    public static function status()
    {
        self::getStatus();
    }

    protected static function getStatus()
    {
        echo "Person is alive";
    }
}
If we call Person::status() we will see “Person is alive”. Now consider what happens when we make a class that inherits from this:
class Deceased extends Person
{
    protected static function getStatus()
    {
        echo "Person is deceased";
    }
}
Calling Deceased::status(), we would expect to see “Person is deceased”. However, what we see is “Person is alive”, because self::getStatus() was bound to the Person class, where status() was defined.
PHP 5.3 has a solution: the static:: resolution operator implements “late static binding”, which is a fancy way of saying that it is bound to the scope of the class that was called. Change the line in status() to static::getStatus() and the results are what you would expect. In older versions of PHP you will have to find a kludge to do this.
http://php.net/manual/en/language.oop5.late-static-bindings.php
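A minimal sketch of the same two classes using static::, assuming PHP 5.3 or later. The methods return their strings rather than echoing them (my change, only to make the result easy to inspect):

```php
<?php
class Person
{
    public static function status()
    {
        // static:: resolves against the class the call was made on.
        return static::getStatus();
    }

    protected static function getStatus()
    {
        return "Person is alive";
    }
}

class Deceased extends Person
{
    protected static function getStatus()
    {
        return "Person is deceased";
    }
}

$alive    = Person::status();
$deceased = Deceased::status();

echo $alive, "\n", $deceased, "\n";
```

With self:: in place of static::, both calls would go to Person::getStatus().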
So to answer the question not as asked …
$this-> refers to the current object (an instance of a class), whereas static:: refers to a class.
NOTE: This answer was originally posted at StackOverflow.com by Sqoo
- Jim answered 13 years ago
- last active 12 years ago
Here is an example of correct usage of $this and self for non-static and static member variables:
<?php
class X {
    private $non_static_member = 1;
    private static $static_member = 2;

    function __construct() {
        echo $this->non_static_member . ' '
           . self::$static_member;
    }
}

new X();
?>
NOTE: This answer was originally posted at StackOverflow.com by Mohit Bumb
- Jane answered 13 years ago
The keyword self does NOT refer merely to the ‘current class’, at least not in a way that restricts you to static members. Within the context of a non-static member, self also provides a way of bypassing the vtable for the current object. Just as you can use parent::methodName() to call the parent’s version of a function, so you can call self::methodName() to call the current class’s implementation of a method.
class Person {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }

    public function getTitle() {
        return $this->getName()." the person";
    }

    public function sayHello() {
        echo "Hello, I'm ".$this->getTitle()."<br/>";
    }

    public function sayGoodbye() {
        echo "Goodbye from ".self::getTitle()."<br/>";
    }
}

class Geek extends Person {
    public function __construct($name) {
        parent::__construct($name);
    }

    public function getTitle() {
        return $this->getName()." the geek";
    }
}

$geekObj = new Geek("Ludwig");
$geekObj->sayHello();
$geekObj->sayGoodbye();
This will output:
Hello, I'm Ludwig the geek
Goodbye from Ludwig the person
sayHello() uses the $this pointer, so the vtable is invoked to call Geek::getTitle().
sayGoodbye() uses self::getTitle(), so the vtable is not used, and Person::getTitle() is called. In both cases, we are dealing with the method of an instantiated object, and have access to the $this pointer within the called functions.
NOTE: This answer was originally posted at StackOverflow.com by nbeagle
- Christine answered 15 years ago
- last active 13 years ago
I’m using Phpass, which is a simple one-file PHP class that can be implemented very easily in nearly every PHP project. See also The H.
By default it uses the strongest hashing scheme implemented in Phpass, which is bcrypt, and falls back to weaker schemes, down to MD5, to provide backward compatibility with frameworks like WordPress.
The returned hash can be stored in the database as is. Sample use for generating a hash:
$t_hasher = new PasswordHash(8, FALSE);
$hash = $t_hasher->HashPassword($password);
To verify a password, one can use:
$t_hasher = new PasswordHash(8, FALSE);
$check = $t_hasher->CheckPassword($password, $hash);
NOTE: This answer was originally posted at StackOverflow.com by rabudde
- Rodney answered 12 years ago
TL;DR
Don’ts
- Don’t limit what characters users can enter for passwords. Only idiots do this.
- Don’t limit the length of a password. If your users want a sentence with supercalifragilisticexpialidocious in it, don’t prevent them from using it.
- Never store your user’s password in plain-text.
- Never email a password to your user, except when they have lost theirs and you are sending a temporary one.
- Never, ever log passwords in any manner.
Do’s
- Use scrypt when you can; bcrypt if you cannot.
- Use PBKDF2 if you cannot use either bcrypt or scrypt.
- Reset everyone’s passwords when the database is compromised.
Why hash passwords anyway?
The objective behind hashing passwords is simple: preventing malicious access to user accounts by compromising the database. So the goal of password hashing is to deter a hacker or cracker by costing them too much time or money to calculate the plain-text passwords. And time/cost are the best deterrents in your arsenal.
Another reason that you want a good, robust hash on user accounts is to give you enough time to change all the passwords in the system. If your database is compromised, you will need enough time to at least lock the system down, if not change every password in the database.
Best practices
Bcrypt and scrypt are the current best practices. Scrypt will be better than bcrypt in time, but it hasn’t seen adoption as a standard by Linux/Unix or by webservers. If you are working with Ruby there is an scrypt gem that will help you out.
I highly suggest reading the documentation for the crypt function if you want to roll your own use of bcrypt, or finding yourself a good wrapper, or using something like PHPASS for a more legacy implementation. I recommend a minimum of 12 rounds of bcrypt, if not 15 to 18.
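On PHP 5.5 and later, the built-in password_hash() and password_verify() functions are one such wrapper around bcrypt. A minimal sketch, assuming a cost of 12 as the starting point; the password value is a placeholder:

```php
<?php
// Placeholder input; in practice this comes from the user.
$password = 'correct horse battery staple';

// bcrypt with a cost of 12; the salt is generated automatically.
$hash = password_hash($password, PASSWORD_BCRYPT, ['cost' => 12]);

// Store $hash as is: it embeds the algorithm, the cost and the salt.
$ok  = password_verify($password, $hash);
$bad = password_verify('wrong guess', $hash);

var_dump($ok, $bad);
```

Raising the cost by one roughly doubles the time per hash, so it is worth timing a few values on your own hardware before settling on one.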
I changed my mind about using bcrypt when I learned that bcrypt only uses blowfish’s key schedule, with a variable cost mechanism. The latter lets you increase the cost to brute-force a password by increasing blowfish’s already expensive key schedule.
Average practices
I almost can’t imagine this situation anymore. PHPASS supports PHP 3.0.18 through 5.3, so it is usable on almost every installation imaginable—and should be if you don’t know for certain that your environment supports bcrypt.
But suppose that you cannot use bcrypt or PHPASS at all. What then?
Try an implementation of PBKDF2 with the minimum number of rounds that your environment/application/user-perception can tolerate. The lowest number I’d recommend is 1000 rounds.
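If your PHP has hash_pbkdf2() (5.5 and later), a rough sketch could look like the following. The SHA-256 choice, the 16-byte salt and the variable names are my assumptions, and random_bytes() needs PHP 7:

```php
<?php
// Placeholder input; in practice this comes from the user.
$password   = 'correct horse battery staple';
// Assumed parameters: tune the iteration count to what your environment tolerates.
$iterations = 1000;             // the lowest number recommended above
$salt       = random_bytes(16); // unique per password, stored next to the hash

// 64 hex characters = 32 bytes of derived key.
$hash = hash_pbkdf2('sha256', $password, $salt, $iterations, 64);

// Verification: recompute with the stored salt and compare in constant time.
$candidate = hash_pbkdf2('sha256', $password, $salt, $iterations, 64);
$matches   = hash_equals($hash, $candidate);

var_dump($matches);
```

The salt and the iteration count both have to be stored alongside the hash, otherwise the value cannot be recomputed at login.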
As I Said Last Time…
The computational power required to actually crack a hashed password doesn’t exist. The only way for computers to “crack” a password is to recreate it and simulate the hashing algorithm used to secure it. The speed of the hash is linearly related to its ability to be brute-forced. Worse still, most hash algorithms can be easily parallelized to be reproduced even faster. This is why costly schemes like bcrypt and scrypt are so important.
You cannot possibly foresee all threats or avenues of attack, and so you must make your best effort to protect your users up front. If you do not, then you might even miss the fact that you were attacked until it’s too late… and you’re liable. To avoid that situation, act paranoid to begin with. Attack your own software (internally) and attempt to steal login information, or access other users’ accounts. If you don’t, you cannot blame anyone but yourself.
Lastly: I am not a cryptographer. Whatever I’ve said is my opinion, but I happen to think it’s based on good ol’ common sense … and lots of reading. Remember, be as paranoid as possible, make things as hard to intrude as possible, and then, if you are still worried, contact a white-hat hacker or cryptographer to see what they say about your code/system.
NOTE: This answer was originally posted at StackOverflow.com by Robert K
- Bryan answered 16 years ago
- last active 13 years ago
A much shorter and safer answer – don’t write your own password mechanism at all, use one that is tried and tested, and incorporated into WordPress, Drupal etc, i.e. Openwall’s phpass.
Most programmers just don’t have the expertise to write crypto related code safely without introducing vulnerabilities.
See this excellent answer for more about why phpass is the best way to go.
NOTE: This answer was originally posted at StackOverflow.com by RichVel
- Anne answered 13 years ago
Wow, I really didn’t know about this. It doesn’t take much code, though: you can simply echo “z” after the loop. Mark is absolutely right and I use his method, but if you want an alternative, you can also try this:
<?php
// Stop after 'y'; incrementing 'z' would yield 'aa'.
for ($i = "a"; $i <= "y"; $i++)
{
    echo "$i\n";
}
// Print the final letter after the loop.
echo "z";
?>
NOTE: This answer was originally posted at StackOverflow.com by Mohit Bumb
- Jane answered 13 years ago
While the above answers are insightful about what’s going on, and pretty interesting (I didn’t know it would behave like this, and it’s good to see why), the easiest fix (although perhaps not the most meaningful) would be just to change the condition to $i != 'aa':
<?php
// After 'z', $i++ yields 'aa', so stopping there prints a through z.
for ($i = 'a'; $i != 'aa'; $i++)
    echo "$i\n";
?>
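The behaviour is easy to check directly. Here is a small sketch of what happens when the counter passes 'z', assuming the original loop used a <= 'z' condition (variable names are illustrative):

```php
<?php
$i = 'z';
$i++;                           // string increment carries over, like spreadsheet columns
$next = $i;                     // 'aa'

// Strings are compared character by character, so 'aa' still sorts before 'z'.
$stillInRange = ('aa' <= 'z');  // true, which is why a <= 'z' loop keeps going

// Stopping at 'aa' instead collects exactly a through z.
$letters = '';
for ($c = 'a'; $c != 'aa'; $c++) {
    $letters .= $c;
}

echo $next, "\n", $letters, "\n";
```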
NOTE: This answer was originally posted at StackOverflow.com by jon_darkstar
- Nancy answered 14 years ago
- last active 13 years ago
Why not just use range('a','z')?
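A minimal sketch of that approach:

```php
<?php
// range() builds the array of letters up front, so no string increment is involved.
$letters = range('a', 'z');

foreach ($letters as $letter) {
    echo $letter, "\n";
}
```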
NOTE: This answer was originally posted at StackOverflow.com by stillstanding
- Dean answered 14 years ago
Try this
1) In wp-config.php, add define('FS_METHOD', 'direct');
2) Set the “wp-content” directory to 777 so that it is writable.
3) Now install the plugin.
NOTE: This answer was originally posted at StackOverflow.com by Mohan Raj
- Vickie answered 12 years ago
- last active 12 years ago
The answer from stereointeractive covers all the options. I just wanted to mention an alternate way of using FTP. I’m guessing that the reason you are not allowing FTP access is security. One way to address those security concerns is to run your FTP server listening only on 127.0.0.1.
This allows you to use FTP from inside WordPress and you will be able to install plugins while not exposing it to the rest of the world. This can also be applied to other popular web applications such as Joomla! and Drupal. This is what we do with our BitNami appliances and cloud servers and works quite well.
NOTE: This answer was originally posted at StackOverflow.com by kaysa
- Tracy answered 13 years ago
I’ve been using WordPress more or less against my will for about two years now (since I started using other frameworks). I think there is no “good Rails alternative”. Of course there are a lot of blog engines, but none have as many plugins available or are as well known with clients. Let’s be honest, WP has a fantastic front-end, and clients seem to like that. The reason “we developers” look for Rails alternatives is obviously that Rails developers aren’t comfortable with WP. But there’s no platform out there that has the same out-of-the-box completeness and user friendliness as WP. For blog-like purposes, that is, of course.
NOTE: This answer was originally posted at StackOverflow.com by Jasper Kennis
- Carl answered 12 years ago
You may want to look into Wordscript; it includes APIs written in Ruby and PHP that connect to existing WordPress databases and return JSON structures (made from generated SQL).
Useful if you want to keep the full administrative features of WordPress and have a somewhat simple WordPress site. Neither version requires WordPress to be installed locally, but you can’t really do anything administrative with the API or with comments/custom fields (yet). Also, the API is much faster and consumes a fraction of the resources WordPress would.
NOTE: This answer was originally posted at StackOverflow.com by redcap3000
- Leslie answered 13 years ago
Although I’ve not used it, I’ve heard that Microsoft Orchard is pretty good.
NOTE: This answer was originally posted at StackOverflow.com by harriyott
- Arthur answered 13 years ago
I think for people with existing HTML, Toko CMS will be the cheapest option.
NOTE: This answer was originally posted at StackOverflow.com by Miko
- Jane answered 15 years ago
- last active 14 years ago
WordPress fits well for a blogging setting and is relatively easy to adapt. I tried Drupal but I couldn’t get it to play well. I’m still considering what CMS functions best with a workflow of translators in multiple languages.
NOTE: This answer was originally posted at StackOverflow.com by SleekCC
- Michelle answered 14 years ago
The answer depends on the requirements. WordPress can be an excellent choice if your customer’s budget is very low.
If they have a bigger budget and want something more, then take a look at the CMSes listed above. For ASP.NET I’ve used SiteCore and SiteFinity and have liked both because they allow a lot of flexibility over design and content. Plus, if I need to, I can just get into the code and add my own user control to get something hard done.
NOTE: This answer was originally posted at StackOverflow.com by BeaverProj
- Lynn answered 16 years ago